Do Rust S3 SDK Datetimes work with Chrono?

Question

I am writing a CLI application to "restore" deleted and overwritten object versions in S3, using the AWS SDK, as my first "real" Rust project.

One part of this is allowing the user to pass in a start and end date between which file changes should be undone.
As such, I've written this function to parse a NaiveDateTime (from the chrono crate) from the user's input:

fn create_option_datetime_from_string(input: String) -> Option<DateTime<Utc>> {
    let date_regex = Regex::new(r"^\d{4}-\d{2}-\d{2}$").unwrap();
    if date_regex.is_match(&input) {
        let parsed_date: ParseResult<NaiveDateTime> = NaiveDateTime::parse_from_str(&input, "%Y-%m-%d");
        if parsed_date.is_ok() {
            let naive_date = parsed_date.unwrap_or_default();
            let utc_date: DateTime<Utc> = DateTime::<Utc>::from_utc(naive_date, Utc);
            return Some(utc_date);
        }
    }
    return None;
}
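
A note on the parser above: as far as I know, chrono's NaiveDateTime::parse_from_str rejects input that carries no time fields, so a date-only string with "%Y-%m-%d" would likely always end up as None. Parsing into a NaiveDate and attaching midnight (for example with and_hms_opt(0, 0, 0)) is the usual route with chrono. The dependency-free sketch below only illustrates the intended result, a date-only string mapped to midnight UTC as seconds since the Unix epoch. The function names are hypothetical and it is not a replacement for chrono's parser.

```rust
/// Days since 1970-01-01 for a proleptic Gregorian date
/// (the well-known "days from civil" algorithm).
fn days_from_civil(year: i64, month: i64, day: i64) -> i64 {
    // Treat January and February as months 11 and 12 of the previous year.
    let y = if month <= 2 { year - 1 } else { year };
    let era = y.div_euclid(400);
    let yoe = y.rem_euclid(400); // year of era, 0..=399
    let mp = if month > 2 { month - 3 } else { month + 9 }; // March = 0
    let doy = (153 * mp + 2) / 5 + day - 1; // day of the March-based year
    let doe = yoe * 365 + yoe / 4 - yoe / 100 + doy; // day of era
    era * 146_097 + doe - 719_468
}

/// Hypothetical helper: parses a strict "YYYY-MM-DD" string and returns
/// midnight UTC of that day as seconds since the Unix epoch.
fn parse_date_to_epoch_secs(input: &str) -> Option<i64> {
    // Only digits and dashes are allowed, which also rules out signs.
    if !input.bytes().all(|b| b.is_ascii_digit() || b == b'-') {
        return None;
    }
    let mut parts = input.split('-');
    let (y, m, d) = (parts.next()?, parts.next()?, parts.next()?);
    if parts.next().is_some() || y.len() != 4 || m.len() != 2 || d.len() != 2 {
        return None;
    }
    let (y, m, d): (i64, i64, i64) = (y.parse().ok()?, m.parse().ok()?, d.parse().ok()?);

    // Reject dates that do not exist, such as 2023-02-29.
    let leap = (y % 4 == 0 && y % 100 != 0) || y % 400 == 0;
    let days_in_month = match m {
        1 | 3 | 5 | 7 | 8 | 10 | 12 => 31,
        4 | 6 | 9 | 11 => 30,
        2 if leap => 29,
        2 => 28,
        _ => return None,
    };
    if d < 1 || d > days_in_month {
        return None;
    }
    Some(days_from_civil(y, m, d) * 86_400)
}
```

Calling parse_date_to_epoch_secs("2023-07-10") yields the epoch seconds for midnight UTC on that day, while malformed or non-existent dates yield None.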

I also have a function to fetch object version data from S3 that looks something like this:

struct ObjectVersionsFetchResponse {
    next_key_marker: Option<String>,
    versions: Vec<ObjectVersion>,
}

async fn fetch_object_versions_from_s3<'a>(
    client: &'a s3::Client,
    bucket: &'a str,
    limit: Option<i32>,
    prefix: &'a String,
    key_marker: Option<String>,
) -> Result<ObjectVersionsFetchResponse, SdkError<ListObjectVersionsError>> {
    let resp = client
        .list_object_versions()
        .bucket(bucket)
        .set_max_keys(limit)
        .prefix(prefix)
        .set_key_marker(key_marker)
        .send()
        .await?;

    /* If this is Some, there are more objects to be fetched so we need to do another request */
    let next_key_marker: Option<&str> = resp.next_key_marker();
    /* Technically we don't need a vector as the response stays at a fixed length, maybe TODO fix */
    let versions: Vec<ObjectVersion> = resp.versions().unwrap_or_default().to_vec();
    /* We return the data including the version marker to the calling method to allow for after-fetching */
    Ok(ObjectVersionsFetchResponse {
        next_key_marker: next_key_marker.map(|s| s.to_string()),
        versions,
    })
}

These functions are expected to give me two things:

  • Possibly a parsed datetime from the user's input
  • A list of object versions from AWS

Now I'm trying to write a function that keeps only the object versions that fall within the timeframe.

To achieve this, I've made a struct to hold the parsed timeframe:

#[derive(Debug)]
struct Timeframe {
    start: Option<DateTime<Utc>>,
    end: DateTime<Utc>,
}

And this function, to hopefully compare the DateTimes in the ObjectVersions with the Timeframe that was parsed from user input:

fn filter_object_versions(object_versions: &Vec<ObjectVersion>, timeframe: Timeframe) -> Vec<&ObjectVersion> {
    println!("object_versions: {:?}", object_versions.len());
    let filtered_object_versions: Vec<_> = object_versions
        .into_par_iter()
        .filter(|object_version| {
            let is_latest = object_version.is_latest;
            let last_modified = object_version.last_modified.as_ref().unwrap_or(&Utc::now());
            let is_after_start = last_modified > timeframe.start.unwrap_or_else(|| Utc::now());
            let is_before_end = last_modified < timeframe.end;
            return is_after_start && is_before_end && is_latest;
        })
        .collect();
    println!("filtered_object_versions: {:?}", filtered_object_versions.len());

    return filtered_object_versions;
}

Sadly this comparison does not work as expected.
As my Timeframe properties are chrono DateTime<Utc> values (built from a NaiveDateTime), they can't be compared to the "Smithy DateTime" that the S3 SDK appears to use.

I'm now looking for advice on how to best do this comparison.

Answer 1

Score: 1

Amazon made a crate for these types of conversions. It's called aws-smithy-types-convert.

Just add it to your Cargo.toml like so:

[dependencies]
aws-smithy-types-convert = { version = "0.56.1", features = ["convert-chrono"] }

Then you can turn the smithy DateTime into a chrono DateTime before comparing.

use aws_smithy_types_convert::date_time::DateTimeExt;

//...

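        .filter(|object_version| {
            let is_latest = object_version.is_latest;
            // Assumption: in recent crate versions `to_chrono_utc` returns a `Result`,
            // hence the `.ok()`. If your version returns a `DateTime<Utc>` directly,
            // use `.map(|t| t.to_chrono_utc())` instead of the `and_then` line.
            let last_modified = object_version.last_modified
                .as_ref()
                .and_then(|t| t.to_chrono_utc().ok())
                .unwrap_or_else(Utc::now);
            let is_after_start = last_modified > timeframe.start.unwrap_or_else(|| Utc::now());
            let is_before_end = last_modified < timeframe.end;
            return is_after_start && is_before_end && is_latest;
        })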

That being said, the code to convert to a "UTC" chrono datetime really just copies over the seconds and subsecond nanoseconds without any timezone handling. A Smithy DateTime is documented as time elapsed since the Unix epoch, so treating it as UTC should be correct, but it is cheap to verify against an object whose modification time you know.
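
Because both sides boil down to time since the Unix epoch, another option is to compare the raw epoch values without converting types at all. The Smithy DateTime exposes secs() and subsec_nanos(), and a chrono DateTime<Utc> exposes timestamp() and timestamp_subsec_nanos(). The sketch below uses a hypothetical stand-in type, EpochTime, rather than either library, purely to show that ordering on the (seconds, nanoseconds) pair is a chronological ordering with no timezone involved.

```rust
/// Hypothetical stand-in for a timestamp stored as time since the Unix epoch.
/// The derived ordering compares `secs` first and `subsec_nanos` second,
/// so the field order matters.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
struct EpochTime {
    secs: i64,
    subsec_nanos: u32,
}

/// True if `t` lies strictly between `start` and `end`.
fn is_within(t: EpochTime, start: EpochTime, end: EpochTime) -> bool {
    start < t && t < end
}
```

In real code, one EpochTime would be filled from the SDK value and the bounds from the chrono values, after which is_within(t, start, end) replaces the cross-type comparison.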

Also, your unwrapping doesn't seem right to me. I would filter out all objects that don't have a timestamp instead of creating a fake Utc::now() timestamp for them. And finally, I would return an impl Iterator instead of a Vec. It reduces allocation (significantly if it's a big list!) in the case that the API consumer uses it as an iterator or wants it as any other type of collection.

fn filter_object_versions(object_versions: &Vec<ObjectVersion>, timeframe: Timeframe) -> impl Iterator<Item = &ObjectVersion> {
    object_versions
        .iter() // a rayon parallel iterator cannot be returned as `impl Iterator`
        .filter(|version| version.is_latest)
        .filter(move |version| {
            // Versions without a usable timestamp are dropped, but the item stays
            // the ObjectVersion itself rather than its timestamp.
            // Assumption: `to_chrono_utc` returns a `Result` in recent crate versions.
            version
                .last_modified
                .as_ref()
                .and_then(|t| t.to_chrono_utc().ok())
                .map_or(false, |last_modified| {
                    let is_after_start = last_modified > timeframe.start.unwrap_or_else(Utc::now);
                    let is_before_end = last_modified < timeframe.end;
                    is_after_start && is_before_end
                })
        })
}
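
To make the filtering idea testable without AWS credentials, here is a minimal sketch of the same pipeline on stand-in types. Version and Window are hypothetical substitutes for ObjectVersion and Timeframe, with timestamps reduced to epoch seconds. One deliberate difference is labeled as an assumption: a missing start is treated as "no lower bound", whereas the code above falls back to Utc::now(), which would exclude every past version.

```rust
/// Hypothetical stand-in for the SDK's `ObjectVersion`, trimmed to the fields
/// used here. `last_modified` is seconds since the Unix epoch.
#[derive(Debug, PartialEq)]
struct Version {
    key: String,
    is_latest: bool,
    last_modified: Option<i64>,
}

/// Hypothetical stand-in for `Timeframe`, with both bounds as epoch seconds.
struct Window {
    start: Option<i64>,
    end: i64,
}

/// Lazily yields the latest versions whose timestamp lies strictly inside
/// the window. Nothing is allocated until the caller collects.
fn filter_versions<'a>(
    versions: &'a [Version],
    window: &'a Window,
) -> impl Iterator<Item = &'a Version> + 'a {
    versions
        .iter()
        .filter(|v| v.is_latest)
        .filter(move |v| match v.last_modified {
            // No timestamp: skip the version rather than inventing one.
            None => false,
            Some(ts) => {
                // Assumption: a missing start means "no lower bound".
                let after_start = window.start.map_or(true, |start| ts > start);
                after_start && ts < window.end
            }
        })
}
```

A caller that needs a Vec can still write filter_versions(&versions, &window).collect::<Vec<_>>(), while a caller that only loops over the result avoids the intermediate allocation.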

huangapple
  • Posted on July 10, 2023, 15:16:42
  • Source: https://go.coder-hub.com/76651472.html