Data Ingestion - TypeError: cannot unpack non-iterable NoneType object

Question

I am getting this error in the data ingestion part of my training pipeline. It shows up when I run training_pipeline.py.

Full traceback:

Traceback (most recent call last):
  File "src\pipelines\training_pipeline.py", line 12, in <module>
    train_data_path, test_data_path = obj.initiate_data_ingestion()
TypeError: cannot unpack non-iterable NoneType object

training_pipeline.py:

import os
import sys
import pandas as pd
from src.logger import logging
from src.exception import CustomException
from src.components.data_ingestion import DataIngestion

if __name__=='__main__':
    obj = DataIngestion()
    train_data_path, test_data_path = obj.initiate_data_ingestion()
    print(train_data_path, test_data_path)

data_ingestion.py:

import os
import sys
from src.exception import CustomException
from src.logger import logging
import pandas as pd
from sklearn.model_selection import train_test_split
from dataclasses import dataclass

@dataclass
class DataIngestionconfig:
    train_data_path=os.path.join('artifacts','train.csv')
    test_data_path=os.path.join('artifacts','test.csv')
    raw_data_path=os.path.join('artifacts','raw.csv')

class DataIngestion:
    def __init__(self):
        self.ingestion_config=DataIngestionconfig()

    def initiate_data_ingestion(self):
        logging.info('Data Ingestion method starts')

        try:
            df=pd.read_csv(os.path.join('notebooks/data','gemstone.csv'))
            logging.info('Dataset read as pandas Dataframe')

            os.makedirs(os.path.dirname(self.ingestion_config.raw_data_path),exist_ok=True)

            df.to_csv(self.ingestion_config.raw_data_path,index=False)

            logging.info("Train test split")
            train_set,test_set = train_test_split(df,test_size=0.30,random_state=42)

            train_set.to_csv(self.ingestion_config.train_data_path,index=False,header=True)
            test_set.to_csv(self.ingestion_config.test_data_path,index=False,header=True)

            logging.info('Ingestion of data is completed')
            
            return(
                self.ingestion_config.train_data_path,
                self.ingestion_config.test_data_path
            ) 

        except Exception as e:
            logging.info('Error occured in Data Ingestion config')

if __name__=="__main__":
    obj=DataIngestion()
    train_data_path,test_data_path=obj.initiate_data_ingestion()

I tried returning the two values as a list, but that didn't work either.

Answer 1

Score: 1

initiate_data_ingestion returns a tuple when no exception occurs, but returns None when one does: the return statement is inside the try block and the except block has no return statement, which in Python is the same as return None. Meanwhile, the caller always expects a tuple as the output. That is the source of the error.
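The behavior can be reproduced without pandas or the project's logger. The sketch below is only an illustration: load_paths is a hypothetical stand-in for initiate_data_ingestion, and the fail flag simulates whatever error occurs inside the try block.

```python
def load_paths(fail):
    """Hypothetical stand-in for initiate_data_ingestion."""
    try:
        if fail:
            raise FileNotFoundError("gemstone.csv")  # simulated failure
        return ("artifacts/train.csv", "artifacts/test.csv")
    except Exception:
        # Nothing is returned or re-raised here, so the function
        # falls off the end and implicitly returns None.
        pass


def try_unpack(fail):
    """Return the unpacked paths, or the TypeError message if unpacking fails."""
    try:
        train_path, test_path = load_paths(fail)
        return train_path, test_path
    except TypeError as exc:
        return str(exc)


print(load_paths(fail=True))    # None
print(try_unpack(fail=False))   # ('artifacts/train.csv', 'artifacts/test.csv')
print(try_unpack(fail=True))    # cannot unpack non-iterable NoneType object
```

This also suggests why returning a list instead of a tuple makes no difference: when the exception fires, the return statement is never reached, so its type does not matter.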

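The correct way to handle exceptions is to make sure the function always returns the correct type (in this case, a tuple), whether or not an exception occurs. If it cannot produce a meaningful value, it should let the exception propagate rather than silently return None.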
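A function that honors this contract has roughly two options, sketched below. The function names, the fail flag, and the (None, None) fallback are assumptions made for illustration; only the two artifact paths come from the question.

```python
import logging

logger = logging.getLogger(__name__)

# Same output paths as in the question.
PATHS = ("artifacts/train.csv", "artifacts/test.csv")


def ingest_or_raise(fail):
    """Option 1: log the failure, then re-raise so the caller never sees None."""
    try:
        if fail:
            raise FileNotFoundError("gemstone.csv")  # simulated failure
        return PATHS
    except FileNotFoundError:
        logger.exception("Error occurred in data ingestion")
        raise


def ingest_or_fallback(fail):
    """Option 2: every code path returns a tuple of the same shape."""
    try:
        if fail:
            raise FileNotFoundError("gemstone.csv")  # simulated failure
        return PATHS
    except FileNotFoundError:
        logger.exception("Error occurred in data ingestion")
        return (None, None)  # hypothetical fallback; the caller must check for it


print(ingest_or_fallback(fail=True))
```

Option 2 keeps the unpacking in the caller from failing, but it only moves the problem: the caller then has to check for the placeholder values. For a pipeline step that cannot continue without its input, re-raising is usually the simpler choice.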

In general, you should never use a catch-all try...except block, because it leads to problems like this one: you catch an exception without knowing what it is, so your program doesn't know how to handle it, and that leads to more errors down the line.
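A small, made-up example of how a catch-all hides an unrelated bug: both functions below are hypothetical and have nothing to do with the question's code. They differ only in which exceptions they catch.

```python
def parse_ratio_catch_all(text):
    """Catches everything, so a caller bug (passing None) looks like bad input."""
    try:
        return float(text) / 100
    except Exception:
        return 0.0


def parse_ratio_narrow(text):
    """Catches only the error we expect: text that is not a number."""
    try:
        return float(text) / 100
    except ValueError:
        return 0.0


print(parse_ratio_catch_all("abc"))  # 0.0, expected bad input
print(parse_ratio_catch_all(None))   # 0.0, but this hides a TypeError

print(parse_ratio_narrow("abc"))     # 0.0, expected bad input
try:
    parse_ratio_narrow(None)
except TypeError as exc:
    # The unexpected error surfaces instead of being swallowed.
    print("TypeError:", exc)
```

With the narrow handler, the unexpected TypeError reaches the caller with a traceback that points at the real mistake.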

The solution is to remove the try...except block, see what the real error is, and fix that. Then rewrite the code to catch only the exceptions you already expect and know how to handle.
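One possible shape for the rewritten method is sketched below. To stay runnable without pandas or scikit-learn, it is written as a standalone function that uses the standard csv module and a plain unshuffled split, so it is a simplified stand-in rather than a drop-in replacement. The source_path and out_dir parameters and the demo column names are assumptions. The pattern should carry over to the original code, since pd.read_csv also raises FileNotFoundError for a missing path.

```python
import csv
import logging
import os
import tempfile

logger = logging.getLogger(__name__)


def initiate_data_ingestion(source_path, out_dir, test_size=0.30):
    """Split a CSV into train/test files and return both paths.

    Either returns a (train_path, test_path) tuple or raises; never None.
    """
    try:
        with open(source_path, newline="") as f:
            rows = list(csv.reader(f))
    except FileNotFoundError:
        # The one failure we expect: log it with its traceback, then re-raise.
        logger.exception("Input file not found: %s", source_path)
        raise

    header, body = rows[0], rows[1:]
    n_test = round(len(body) * test_size)
    split = len(body) - n_test  # unshuffled split, unlike train_test_split

    os.makedirs(out_dir, exist_ok=True)
    train_path = os.path.join(out_dir, "train.csv")
    test_path = os.path.join(out_dir, "test.csv")
    for path, part in ((train_path, body[:split]), (test_path, body[split:])):
        with open(path, "w", newline="") as f:
            writer = csv.writer(f)
            writer.writerow(header)
            writer.writerows(part)
    return train_path, test_path


# Demo on a throwaway file with made-up columns, so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    source = os.path.join(tmp, "gemstone.csv")
    with open(source, "w", newline="") as f:
        csv.writer(f).writerows([["carat", "price"]] + [[i, i * 10] for i in range(10)])
    train_path, test_path = initiate_data_ingestion(source, os.path.join(tmp, "artifacts"))
    print(os.path.basename(train_path), os.path.basename(test_path))
```

Any other error, such as an empty file, is deliberately left uncaught, so it reaches the caller with a full traceback instead of turning into None.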

huangapple
  • Posted on June 11, 2023 at 23:22:03
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76451154.html