Question

I need to convert a PDF file into a byte array and save it in a text file.
Then I need to read the text file back and restore the PDF file.

For this I am using the code below (I am using UTF-8 to write the text file).

using System.Text;

string SourceFile = @"C:\Files\originalfile.pdf";
string SerializedFile = @"C:\Files\serialized.txt";
string RevivedFile = @"C:\Files\revived.pdf";

SerializeFile(SourceFile, SerializedFile);
DeSerializeFile(SerializedFile, RevivedFile);

//Serialize
static void SerializeFile(string SourceFilePath, string SerializedFilePath)
{
    byte[] SourceBytes = File.ReadAllBytes(SourceFilePath);
    File.WriteAllText(SerializedFilePath, Encoding.UTF8.GetString(SourceBytes));
    Console.WriteLine("Serialized File created");
}


//De-Serialize
static void DeSerializeFile(string SerializedFilePath, string RevivedFilePath)
{
    byte[] RevivedBytes;
    using (var sr = new StreamReader(SerializedFilePath))
    {
        RevivedBytes = Encoding.UTF8.GetBytes(sr.ReadToEnd());
    }
    File.WriteAllBytes(RevivedFilePath, RevivedBytes);
    Console.WriteLine("File Revived.");
}

The serialized file is generated successfully, but the revived file comes out corrupted.

How can I correctly restore the PDF file?

Answer 1

Score: 3

The answer is simple: don't treat opaque binary data as text.

Use File.WriteAllBytes instead of File.WriteAllText in your SerializeFile method, and just use File.ReadAllBytes in DeSerializeFile.
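
A minimal sketch of that change, keeping the method names from the question. The temp files and the sample bytes are assumptions that stand in for the real PDF paths, so the snippet can run anywhere.

```csharp
using System;
using System.IO;
using System.Linq;

// Temp files stand in for originalfile.pdf, serialized.txt and revived.pdf (assumption).
string sourceFile = Path.GetTempFileName();
string serializedFile = Path.GetTempFileName();
string revivedFile = Path.GetTempFileName();

// Sample content: "%PDF" followed by bytes that are not valid UTF-8 text.
File.WriteAllBytes(sourceFile, new byte[] { 0x25, 0x50, 0x44, 0x46, 0xFF, 0x00, 0x80 });

SerializeFile(sourceFile, serializedFile);
DeSerializeFile(serializedFile, revivedFile);

// Compare the revived file with the source, byte for byte.
bool identical = File.ReadAllBytes(sourceFile).SequenceEqual(File.ReadAllBytes(revivedFile));
Console.WriteLine(identical);

// Serialize: write the raw bytes, with no text encoding involved.
static void SerializeFile(string sourceFilePath, string serializedFilePath)
{
    byte[] sourceBytes = File.ReadAllBytes(sourceFilePath);
    File.WriteAllBytes(serializedFilePath, sourceBytes);
}

// De-serialize: read the raw bytes back and write them out unchanged.
static void DeSerializeFile(string serializedFilePath, string revivedFilePath)
{
    byte[] revivedBytes = File.ReadAllBytes(serializedFilePath);
    File.WriteAllBytes(revivedFilePath, revivedBytes);
}
```

Written this way, both methods are plain file copies, so File.Copy would do the same job.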

Your code currently assumes that you can treat all data as UTF-8-encoded text, converting between bytes and strings with no loss of information. That's simply not the case, because not every byte sequence is valid UTF-8 data.
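
The loss can be seen with a handful of bytes. This is only an illustrative sketch: the sample array is made up, and the single 0xFF byte stands in for the arbitrary binary content a real PDF contains.

```csharp
using System;
using System.Linq;
using System.Text;

// "%PDF" followed by 0xFF, a byte that can never appear in valid UTF-8.
byte[] original = { 0x25, 0x50, 0x44, 0x46, 0xFF };

// Decoding replaces the invalid byte with U+FFFD (the replacement character)...
string asText = Encoding.UTF8.GetString(original);

// ...and encoding that character again yields different bytes (EF BF BD), not 0xFF.
byte[] roundTripped = Encoding.UTF8.GetBytes(asText);

Console.WriteLine(BitConverter.ToString(original));
Console.WriteLine(BitConverter.ToString(roundTripped));
Console.WriteLine(original.SequenceEqual(roundTripped)); // False
```

The original byte value is gone after decoding, so no later step can recover it.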

It's unclear why you even want to convert the data to text, given that you're really just copying files.

If for some reason you actually need to represent opaque data as text, you should use base64 or some similar mechanism, which is a reversible transform: bytes -> base64 -> bytes will always preserve all the data, regardless of what that data is. (Assuming a correct implementation etc, of course.)
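
A minimal sketch of the base64 route, using Convert.ToBase64String and Convert.FromBase64String. The helper names SerializeToBase64 and DeserializeFromBase64 are hypothetical, and the temp files and sample bytes are assumptions that stand in for the real PDF.

```csharp
using System;
using System.IO;
using System.Linq;

// Temp files stand in for the real source, text and revived files (assumption).
string sourceFile = Path.GetTempFileName();
string serializedFile = Path.GetTempFileName();
string revivedFile = Path.GetTempFileName();

// Sample content: "%PDF" followed by bytes that are not valid UTF-8 text.
byte[] original = { 0x25, 0x50, 0x44, 0x46, 0xFF, 0x00, 0x80 };
File.WriteAllBytes(sourceFile, original);

SerializeToBase64(sourceFile, serializedFile);
DeserializeFromBase64(serializedFile, revivedFile);

byte[] restored = File.ReadAllBytes(revivedFile);
Console.WriteLine(File.ReadAllText(serializedFile));   // plain ASCII text
Console.WriteLine(original.SequenceEqual(restored));   // True

// Hypothetical helper: bytes -> base64 text file.
static void SerializeToBase64(string sourcePath, string serializedPath)
{
    byte[] sourceBytes = File.ReadAllBytes(sourcePath);
    File.WriteAllText(serializedPath, Convert.ToBase64String(sourceBytes));
}

// Hypothetical helper: base64 text file -> bytes.
static void DeserializeFromBase64(string serializedPath, string revivedPath)
{
    string base64 = File.ReadAllText(serializedPath);
    File.WriteAllBytes(revivedPath, Convert.FromBase64String(base64));
}
```

The text file ends up roughly a third larger than the source, which is the usual cost of base64.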

