June 11, 2023, 23:22:03

DATA INGESTION - TypeError: cannot unpack non-iterable NoneType object

Question

I am getting this error in the data ingestion part of my training pipeline. It shows up when I run training_pipeline.py.

Full traceback:

Traceback (most recent call last):
  File "src\pipelines\training_pipeline.py", line 12, in <module>
    train_data_path, test_data_path = obj.initiate_data_ingestion()
TypeError: cannot unpack non-iterable NoneType object

training_pipeline.py:

import os
import sys
import pandas as pd
from src.logger import logging
from src.exception import CustomException
from src.components.data_ingestion import DataIngestion
if __name__=='__main__':
    obj = DataIngestion()
    train_data_path, test_data_path = obj.initiate_data_ingestion()
    print(train_data_path, test_data_path)

data_ingestion.py:

import os
import sys
from src.exception import CustomException
from src.logger import logging
import pandas as pd
from sklearn.model_selection import train_test_split
from dataclasses import dataclass
@dataclass
class DataIngestionconfig:
    train_data_path=os.path.join('artifacts','train.csv')
    test_data_path=os.path.join('artifacts','test.csv')
    raw_data_path=os.path.join('artifacts','raw.csv')
class DataIngestion:
    def __init__(self):
        self.ingestion_config=DataIngestionconfig()
    def initiate_data_ingestion(self):
        logging.info('Data Ingestion method starts')
        try:
            df=pd.read_csv(os.path.join('notebooks/data','gemstone.csv'))
            logging.info('Dataset read as pandas Dataframe')
            os.makedirs(os.path.dirname(self.ingestion_config.raw_data_path),exist_ok=True)
            df.to_csv(self.ingestion_config.raw_data_path,index=False)
            logging.info("Train test split")
            train_set,test_set = train_test_split(df,test_size=0.30,random_state=42)
            train_set.to_csv(self.ingestion_config.train_data_path,index=False,header=True)
            test_set.to_csv(self.ingestion_config.test_data_path,index=False,header=True)
            logging.info('Ingestion of data is completed')
            
            return(
                self.ingestion_config.train_data_path,
                self.ingestion_config.test_data_path
            ) 
        except Exception as e:
            logging.info('Error occured in Data Ingestion config')
if __name__=="__main__":
    obj=DataIngestion()
    train_data_path,test_data_path=obj.initiate_data_ingestion()

I tried returning the two values as a list, but that didn't work either.

Answer 1

Score: 1

initiate_data_ingestion returns a tuple when no exception is raised, but None when one is: the return statement is inside the try block and the except block has no return statement, which in Python is the same as returning None. Meanwhile, the caller always expects a tuple. That's the source of the error.
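
The behaviour described above can be reproduced in a few lines. This is a minimal sketch, not the asker's code: load_paths and its fail flag are hypothetical stand-ins for initiate_data_ingestion and for whatever makes the real method raise.

```python
def load_paths(fail):
    """Mimic the question's method: return inside try, no return in except."""
    try:
        if fail:
            # Hypothetical stand-in for any error raised inside the try block.
            raise FileNotFoundError("gemstone.csv")
        return ("artifacts/train.csv", "artifacts/test.csv")
    except Exception:
        # The exception is swallowed here. Execution falls off the end of
        # the function, so Python implicitly returns None.
        pass


ok_result = load_paths(fail=False)   # a tuple of two paths
bad_result = load_paths(fail=True)   # None, because the except block returns nothing

try:
    train_path, test_path = bad_result
except TypeError as err:
    # Unpacking None raises the TypeError from the question.
    unpack_error = str(err)
```

The original exception never reaches the caller, so the only visible symptom is the TypeError raised at the unpacking line.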

The correct way to handle exceptions is to make sure the function always returns the correct type (in this case, a tuple) regardless of whether an exception occurs.
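
One way to read this advice, assuming the caller cannot do anything useful without the two paths, is to log the error and re-raise it, so the function either returns a tuple or propagates the original exception and never silently returns None. The sketch below is an illustration under that assumption; initiate_data_ingestion_sketch and its source_exists flag are hypothetical names, not part of the asker's project.

```python
import logging


def initiate_data_ingestion_sketch(source_exists):
    """Return a (train_path, test_path) tuple, or let the real error propagate."""
    try:
        if not source_exists:
            # Hypothetical stand-in for a failing read of the dataset.
            raise FileNotFoundError("notebooks/data/gemstone.csv")
        return ("artifacts/train.csv", "artifacts/test.csv")
    except Exception:
        # logging.exception records the message together with the traceback.
        logging.exception("Error occurred in data ingestion")
        # A bare raise re-raises the exception that is being handled.
        raise


paths = initiate_data_ingestion_sketch(source_exists=True)

try:
    initiate_data_ingestion_sketch(source_exists=False)
    propagated = None
except FileNotFoundError as err:
    propagated = err
```

With this shape, the traceback points at the real failure instead of at the unpacking line in the caller.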

In general, you should never use a catch-all try...except block, because it leads to problems like this one: you catch an exception without knowing what it is, so your program cannot handle it properly, and that leads to more errors down the line.

Your solution is to remove the try...except block, see what the real error is, and fix that. Then rewrite the code to catch only the exceptions that you expect and know how to handle.
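
As an illustration of catching only an expected exception, here is a minimal sketch that assumes, purely for the example, that the real error turns out to be a missing input file. The read_rows helper is hypothetical, and it uses the standard library csv module in place of pandas so that it runs without extra dependencies.

```python
import csv
import logging
import os


def read_rows(csv_path):
    """Read a CSV file, handling only the failure we expect and can explain."""
    try:
        with open(csv_path, newline="") as handle:
            return list(csv.reader(handle))
    except FileNotFoundError:
        # Expected failure: the dataset is missing, or the script was started
        # from a directory where the relative path does not resolve.
        logging.error("Dataset not found at %s (cwd: %s)", csv_path, os.getcwd())
        raise
```

Any other exception, such as a permissions problem or a decoding error, is not caught here and surfaces with its own traceback.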
