Convert str to NaiveDate datatype in rust polars


Question

I need to convert a Series of str values to Polars Date format.

Using the documentation as inspiration: https://pola-rs.github.io/polars-book/user-guide/concepts/data-structures/#dataframe, I've written the following function:

pub fn convert_str_to_date(s: Series) -> Option<Series> {
    Some(
        s.iter()
            .map(|v| {
                let date = v.to_string().replace(",", "").replace("\"", "");
                let year = &date[0..4];
                let month = &date[4..6];
                let day = &date[6..8];
                NaiveDate::from_ymd_opt(year.parse::<i32>().unwrap(), month.parse::<u32>().unwrap(), day.parse::<u32>().unwrap()).unwrap()
            }).collect()
    )
}

To validate the function is working, I've written this test:

#[test]
fn test_convert_str_to_date() {
    let test_s = Series::new("", &["20230101", "20230630", "20220229"]);
    let transformed_s = convert_str_to_date(test_s).unwrap();
    let expected_s = Series::new(
        "",
        &[NaiveDate::from_ymd_opt(2023, 1, 1).unwrap(), NaiveDate::from_ymd_opt(2023, 6, 30).unwrap(), NaiveDate::from_ymd_opt(2022, 2, 29).unwrap()],
    );
    assert_eq!(transformed_s, expected_s);
}

I expect the test to pass. However, I receive the following error from the compiler:

error[E0277]: a value of type `polars::prelude::Series` cannot be built from an iterator over elements of type `NaiveDate`
    --> src/transform.rs:106:16
     |
106  |             }).collect()
     |                ^^^^^^^ value of type `polars::prelude::Series` cannot be built from `std::iter::Iterator<Item=NaiveDate>`
     |
     = help: the trait `FromIterator<NaiveDate>` is not implemented for `polars::prelude::Series`
     = help: the following other types implement trait `FromIterator<A>`:
               <polars::prelude::Series as FromIterator<&'a bool>>
               <polars::prelude::Series as FromIterator<&'a f32>>
               <polars::prelude::Series as FromIterator<&'a f64>>
               <polars::prelude::Series as FromIterator<&'a i32>>
               <polars::prelude::Series as FromIterator<&'a i64>>
               <polars::prelude::Series as FromIterator<&'a str>>
               <polars::prelude::Series as FromIterator<&'a u32>>
               <polars::prelude::Series as FromIterator<&'a u64>>
             and 15 others
note: the method call chain might not have had the expected associated types

Answer 1

Score: 0

Polars does not implement FromIterator<NaiveDate> for Series, so collect() cannot build a Series directly from an iterator of NaiveDate values.
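
As a side illustration of the compiler error: collect() only compiles when the target type implements FromIterator for the iterator's item type. The sketch below uses only the standard library. DateColumn and to_date_column are hypothetical names made up for this example and are not part of Polars.

```rust
/// Hypothetical stand-in for a date column; this is not a Polars type.
#[derive(Debug, PartialEq)]
struct DateColumn {
    /// Dates kept as (year, month, day) tuples for simplicity.
    values: Vec<(i32, u32, u32)>,
}

// `collect()` can target `DateColumn` only because this impl exists for the
// exact item type the iterator yields. Remove it and the compiler reports
// the same E0277 error shown in the question.
impl std::iter::FromIterator<(i32, u32, u32)> for DateColumn {
    fn from_iter<I: IntoIterator<Item = (i32, u32, u32)>>(iter: I) -> Self {
        DateColumn {
            values: iter.into_iter().collect(),
        }
    }
}

/// Splits "YYYYMMDD" strings into tuples and collects them into a column.
/// Panics on malformed input, like the code in the question.
fn to_date_column(raw: &[&str]) -> DateColumn {
    raw.iter()
        .map(|s| {
            (
                s[0..4].parse::<i32>().unwrap(),
                s[4..6].parse::<u32>().unwrap(),
                s[6..8].parse::<u32>().unwrap(),
            )
        })
        .collect()
}
```

Series has such impls for types like &i32 and &str, but none for NaiveDate, which is why the dates have to go through another constructor.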

From<Logical<DateType, Int32Type>> is implemented for Series, so you can pass the iterator to Logical::from_naive_date and then convert the result into a Series:

pub fn convert_str_to_date(s: Series) -> Option<Series> {
    Some(
        Logical::from_naive_date(
            "column_name",
            s.iter().map(|v| {
                let date = v.to_string().replace(",", "").replace("\"", "");
                let year = &date[0..4];
                let month = &date[4..6];
                let day = &date[6..8];
                NaiveDate::from_ymd_opt(
                    year.parse::<i32>().unwrap(),
                    month.parse::<u32>().unwrap(),
                    day.parse::<u32>().unwrap(),
                )
                .unwrap()
            }),
        )
        .into(),
    )
}
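
For context on why Logical<DateType, Int32Type> shows up here: as far as I understand it, a Polars Date column is physically an Int32 holding the number of days since the Unix epoch (1970-01-01), with the date meaning layered on top. Below is a minimal standard-library sketch of that mapping. days_since_epoch is a hypothetical helper, not a Polars or chrono function, and it assumes the input is already a valid calendar date.

```rust
/// Days between 1970-01-01 and the given proleptic Gregorian date.
/// Hypothetical helper for illustration; the date is assumed to be valid.
fn days_since_epoch(year: i32, month: u32, day: u32) -> i32 {
    // Count years from March so that a leap day falls at the end of the year.
    let y = i64::from(if month <= 2 { year - 1 } else { year });
    // The Gregorian calendar repeats every 400 years (146_097 days).
    let era = y.div_euclid(400);
    let year_of_era = y.rem_euclid(400);
    // March = 0, April = 1, ..., February = 11.
    let shifted_month = (i64::from(month) + 9) % 12;
    let day_of_year = (153 * shifted_month + 2) / 5 + i64::from(day) - 1;
    let day_of_era = year_of_era * 365 + year_of_era / 4 - year_of_era / 100 + day_of_year;
    // 719_468 is the number of days from 0000-03-01 to 1970-01-01.
    (era * 146_097 + day_of_era - 719_468) as i32
}
```

With this helper, days_since_epoch(1970, 1, 1) is 0 and days_since_epoch(2023, 1, 1) is 19358.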

Alternatively, you can collect the NaiveDates into a Vec, and then use Series::new to construct the Series:

pub fn convert_str_to_date2(s: Series) -> Option<Series> {
    let naive_dates: Vec<_> = s
        .iter()
        .map(|v| {
            let date = v.to_string().replace(",", "").replace("\"", "");
            let year = &date[0..4];
            let month = &date[4..6];
            let day = &date[6..8];
            NaiveDate::from_ymd_opt(
                year.parse::<i32>().unwrap(),
                month.parse::<u32>().unwrap(),
                day.parse::<u32>().unwrap(),
            )
            .unwrap()
        })
        .collect();
    Some(Series::new("column_name", &naive_dates))
}

Answer 2

Score: 0

Simpler than the accepted answer:

pub fn convert_str_to_date(s: &Series) -> Result<Series, PolarsError> {
    let mut s = s
        .utf8()?
        .replace_all(r#",|""#, "")?
        .as_date(Some("%Y%m%d"), true)?
        .into_series();
    s.rename("column_name");
    Ok(s)
}

fn main() {
    let test_s = Series::new("", &["2023,0101", "202306,30", "20220229"]);
    let transformed_s = convert_str_to_date(&test_s).unwrap();

    println!("{:?}", transformed_s);
}

Output:

shape: (3,)
Series: 'column_name' [date]
[
        2023-01-01
        2023-06-30
        null
]
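
The null in the last row is expected: 2022 is not a leap year, so 2022-02-29 does not exist. For the same reason, NaiveDate::from_ymd_opt(2022, 2, 29).unwrap() in the question's test would panic. The sketch below shows the same clean-then-validate idea with the standard library only. parse_compact_date and is_leap_year are hypothetical helpers written for this example, not Polars or chrono APIs.

```rust
/// Returns true for leap years in the Gregorian calendar.
fn is_leap_year(year: i32) -> bool {
    (year % 4 == 0 && year % 100 != 0) || year % 400 == 0
}

/// Strips commas and double quotes, then parses "YYYYMMDD".
/// Returns None for malformed input or a nonexistent calendar date.
fn parse_compact_date(raw: &str) -> Option<(i32, u32, u32)> {
    let cleaned: String = raw.chars().filter(|c| *c != ',' && *c != '"').collect();
    // Require exactly eight ASCII digits so the slices below cannot panic.
    if cleaned.len() != 8 || !cleaned.bytes().all(|b| b.is_ascii_digit()) {
        return None;
    }
    let year: i32 = cleaned[0..4].parse().ok()?;
    let month: u32 = cleaned[4..6].parse().ok()?;
    let day: u32 = cleaned[6..8].parse().ok()?;
    let days_in_month = match month {
        1 | 3 | 5 | 7 | 8 | 10 | 12 => 31,
        4 | 6 | 9 | 11 => 30,
        2 if is_leap_year(year) => 29,
        2 => 28,
        _ => return None, // month outside 1..=12
    };
    if day == 0 || day > days_in_month {
        return None;
    }
    Some((year, month, day))
}
```

Returning Option instead of calling unwrap() lets the caller decide whether an invalid date becomes a missing value, as in the output above, or an error.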

huangapple
  • Posted on May 21, 2023 at 08:47:07
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76297868.html