Remove a column from a CSV file in Notepad++
Question
First of all, I'm new to Notepad++. I'm trying to edit a dataset stored as CSV in Notepad++. I can't open the file in Excel because some cells contain numbers longer than 15 digits, and Excel converts them to scientific notation.
The column headings are:
BNFUniqueID,username,CollectTeam,Hos_methodID,CollectData,hhstatus,hhupazila,hhunion,hhlocationsitetype,hhsitename,hhlocalblock,villagename,hhlandmark,hhmajiname,hhmajitel,HoHname,Hhsize,HoHcontact,cardtype,cardnum,olduniqueID,BNFname,BNFage,BNFsex,BNFagegroup
The dataset has a column (hhlandmark) in the middle, and I want to delete that whole column. One problem is that not all cells contain data, so column selection with <kbd>ALT</kbd> + <kbd>SHIFT</kbd> + <kbd>↓</kbd> isn't suitable for this task: the vertical selection would also cover cells from other columns.
I'm looking for a way to avoid this and delete only the column I want to delete.
Answer 1
Score: 2
There are 12 columns before the hhlandmark column, so we can try the following find and replace in regex mode:
Find: ^((?:[^,]*,){12})[^,]*,(.*)$
Replace: $1$2
This pattern says to match:
^ : from the start of the line
((?:[^,]*,){12}) : match and capture in $1 the first 12 columns
[^,]*, : then match the 13th column (hhlandmark)
(.*) : match and capture in $2 the rest of the line
$ : end of the line
We then replace with $1$2 to effectively splice out the 13th column, hhlandmark.
Here is a running regex demo.
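For anyone who wants to check the pattern outside Notepad++, here is a minimal Python sketch of the same find and replace. The names HEADER, PATTERN and drop_13th_column are made up for illustration, and the sketch assumes that no field contains a quoted comma, since the pattern counts every comma as a column separator.

```python
import re

# Header line taken from the question; hhlandmark is the 13th column.
HEADER = (
    "BNFUniqueID,username,CollectTeam,Hos_methodID,CollectData,hhstatus,"
    "hhupazila,hhunion,hhlocationsitetype,hhsitename,hhlocalblock,villagename,"
    "hhlandmark,hhmajiname,hhmajitel,HoHname,Hhsize,HoHcontact,cardtype,"
    "cardnum,olduniqueID,BNFname,BNFage,BNFsex,BNFagegroup"
)

# Same pattern as in the answer: capture the first 12 columns,
# skip the 13th, capture the rest of the line.
PATTERN = re.compile(r"^((?:[^,]*,){12})[^,]*,(.*)$")


def drop_13th_column(line: str) -> str:
    """Return one CSV line without its 13th field (hypothetical helper).

    Assumes plain comma-separated fields with no quoted commas.
    Lines with fewer than 14 fields do not match and are returned unchanged.
    """
    # Python writes backreferences as \1 and \2 where Notepad++ uses $1 and $2.
    return PATTERN.sub(r"\1\2", line, count=1)


if __name__ == "__main__":
    print(drop_13th_column(HEADER))
```

Applying the function line by line keeps each match inside a single row, which matters because [^,] can also match a line break.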
Answer 2
Score: 0
I don't know about Notepad++, but you could do this easily using the pandas library in Python.
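The answer gives no code, so the following is only a rough sketch of what a pandas approach might look like. The function name drop_column and the sample data are invented for illustration. Reading every column as a string (dtype=str) is an assumption made here so that the long digit strings mentioned in the question are written back unchanged.

```python
import io

import pandas as pd


def drop_column(csv_text: str, column: str) -> str:
    """Return csv_text without the named column (hypothetical helper)."""
    # dtype=str keeps long digit strings exactly as written;
    # keep_default_na=False keeps empty cells as empty strings, not NaN.
    df = pd.read_csv(io.StringIO(csv_text), dtype=str, keep_default_na=False)
    df = df.drop(columns=[column])
    # index=False avoids writing the row index as an extra column.
    return df.to_csv(index=False)


if __name__ == "__main__":
    # Made-up sample data with an empty hhlandmark cell and a long number.
    sample = (
        "BNFUniqueID,hhlandmark,BNFname\n"
        "12345678901234567890,,Ann\n"
        "2,near school,Bob\n"
    )
    print(drop_column(sample, "hhlandmark"))
```

For a real file, pd.read_csv and DataFrame.to_csv accept file paths in place of the in-memory text used here.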