Attempting to read parts of a file at a time, modifying it and saving it, then reversing the changes and saving it again. The file gets corrupted when doing it


Question


I am attempting to read a large file in chunks (as blocks of bytes, not strings or chars), encrypt each chunk and save it to a temporary file until the whole file has been read, at which point the temporary file overwrites the original file. After the encryption process I decrypt it the same way, except that the data is decrypted rather than encrypted.

The encryption simply adds 1 to each byte of the data; decryption subtracts 1 from each byte.

The problem:

The problem I am facing is that the file is not the same size after encrypting and decrypting it, and it does not contain the same data it began with.

Yes, I have tried reading the whole file using File.ReadAllBytes(file) and that worked, but since I may have to work with large files (1-2 GB), I am reading chunks instead to avoid using large amounts of resources. The buffer size can be anything between 16 and 128 MB.

The code that I have is this:

Encryption process:

byte[] Data = new byte[bufferSize];
using (FileStream _temporaryFile_filestream = new FileStream(file + ".tmp", FileMode.Create, FileAccess.Write, FileShare.None))
{
  FileStream _originalFile_filestream = new FileStream(file, FileMode.Open, FileAccess.Read, FileShare.None);
  int bytesRead = -1;

  while ((bytesRead = _originalFile_filestream.Read(Data, 0, bufferSize)) > 0){

    for (int i = 0; i < Data.Length; i++) {
      Data[i] += 1;
    }
  }

// Close files.
  _originalFile_filestream.Close();
  _temporaryFile_filestream.Close();
}

File.Delete(file);  //Removing original file

// Rewriting the encrypted data back to the original filename.
bufferSize = (640000 * 1024);   // 64 megabyte buffer size.
using (FileStream _originalFile_filestream = new FileStream(file, FileMode.Create, FileAccess.Write, FileShare.None)) {
  FileStream _temporaryFile_filestream = new FileStream(file + ".tmp", FileMode.Open, FileAccess.Read, FileShare.None);
  _originalFile_filestream.SetLength(_temporaryFile_filestream.Length);
  int bytesRead = -1;
  byte[] _data = new byte[bufferSize];

  while ((bytesRead = _temporaryFile_filestream.Read(_data, 0, bufferSize)) > 0) {
    _originalFile_filestream.Write(_data, 0, bytesRead);
  }
  // Close files.
  _originalFile_filestream.Close();
  _temporaryFile_filestream.Close();
}

File.Delete(file + ".tmp"); // Delete temporary file.

Decryption process:

byte[] Data = new byte[bufferSize];
using (FileStream _temporaryFile_filestream = new FileStream(file + ".tmp", FileMode.Create, FileAccess.Write, FileShare.None))
{
  FileStream _originalFile_filestream = new FileStream(file, FileMode.Open, FileAccess.Read, FileShare.None);
  int bytesRead = -1;

  while ((bytesRead = _originalFile_filestream.Read(Data, 0, bufferSize)) > 0){

    for (int i = 0; i < Data.Length; i++) {
      Data[i] -= 1;
    }
  }

// Close files.
  _originalFile_filestream.Close();
  _temporaryFile_filestream.Close();
}

File.Delete(file);  //Removing original file

// Rewriting the decrypted data back to the original filename.
bufferSize = (640000 * 1024);   // 64 megabyte buffer size.
using (FileStream _originalFile_filestream = new FileStream(file, FileMode.Create, FileAccess.Write, FileShare.None)) {
  FileStream _temporaryFile_filestream = new FileStream(file + ".tmp", FileMode.Open, FileAccess.Read, FileShare.None);
  _originalFile_filestream.SetLength(_temporaryFile_filestream.Length);
  int bytesRead = -1;
  byte[] _data = new byte[bufferSize];

  while ((bytesRead = _temporaryFile_filestream.Read(_data, 0, bufferSize)) > 0) {
    _originalFile_filestream.Write(_data, 0, bytesRead);
  }
  // Close files.
  _originalFile_filestream.Close();
  _temporaryFile_filestream.Close();
}

File.Delete(file + ".tmp"); // Delete temporary file.

Answer 1

Score: 0


Welcome to programming, Anas. We all had to start somewhere. Let's look at the main culprit of your problem, and then I'll offer some other critiques.

Here's your first 'encryption' pass:

byte[] Data = new byte[bufferSize];
using (FileStream _temporaryFile_filestream = new FileStream(file + ".tmp", FileMode.Create, FileAccess.Write, FileShare.None))
{
  FileStream _originalFile_filestream = new FileStream(file, FileMode.Open, FileAccess.Read, FileShare.None);
  int bytesRead = -1;

  while ((bytesRead = _originalFile_filestream.Read(Data, 0, bufferSize)) > 0){

    for (int i = 0; i < Data.Length; i++) {
      Data[i] += 1;
    }
  }

// Close files.
  _originalFile_filestream.Close();
  _temporaryFile_filestream.Close();
}

Not a bad first attempt, but you've missed a few points:

  1. You get the number of bytes read from the file, but you don't actually use that value in your 'encryption' loop.

  2. You never actually write the modified bytes (or anything) to the temporary file.

  3. You really should have a using for both streams.

  4. There are much shorter ways to write those stream creations.
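
To see why point 1 matters once you do start writing, here is a minimal sketch using in-memory streams. The 10-byte source and 4-byte buffer are made-up values for illustration, not taken from your code. The last Read only fills part of the buffer, so anything that works on the full buffer length picks up stale bytes from the previous chunk and the output grows:

```csharp
using System;
using System.IO;

// Hypothetical demo data: a 10-byte "file" read through a 4-byte buffer.
byte[] source = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
byte[] buffer = new byte[4];

using var input = new MemoryStream(source);
using var wrong = new MemoryStream();   // receives the whole buffer every time
using var right = new MemoryStream();   // receives only the bytes actually read

int readCount;
while ((readCount = input.Read(buffer, 0, buffer.Length)) > 0)
{
    // The final Read returns 2, but the buffer still holds 2 stale bytes.
    wrong.Write(buffer, 0, buffer.Length);
    right.Write(buffer, 0, readCount);
}

Console.WriteLine($"source: {source.Length}, wrong: {wrong.Length}, right: {right.Length}");
```

The same reasoning applies to the transform loop: only the first readCount bytes of the buffer belong to the current chunk.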

Let's rewrite that:

byte[] Data = new byte[bufferSize];
using (var original = File.OpenRead(file))
using (var temp = File.Create(file + ".tmp"))
{
	int readCount = 0;
	while ((readCount = original.Read(Data, 0, bufferSize)) > 0)
	{
		// Only process the bytes we actually read
		for (int i = 0; i < readCount; i++)
			Data[i] += 1;

		// Write out the bytes we just processed (inside the loop, so every chunk is saved)
		temp.Write(Data, 0, readCount);
	}
}

Note that disposing of a stream (which is handled automatically by the using statement) flushes and closes it automatically, so you don't need to do that explicitly.

In the next part you are simply copying the content of your temp file back to the original... but you could just rename the file instead. So let's finish up your 'encrypt' method in the simplest way:

File.Delete(file);
File.Move(file + ".tmp", file);
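
One thing to be aware of with Delete followed by Move: if the process dies between the two calls, only the .tmp file is left. A possible alternative, sketched below, is File.Replace, which swaps the temp file into place in a single call. SwapIntoPlace and the sample paths are hypothetical names for illustration, and this assumes both files live on the same volume:

```csharp
using System;
using System.IO;

// Hypothetical helper: move the finished temp file over the original.
static void SwapIntoPlace(string tempFileName, string fileName)
{
    if (File.Exists(fileName))
    {
        // File.Replace needs an existing destination; passing null as the
        // third argument means no backup copy of the old file is kept.
        File.Replace(tempFileName, fileName, null);
    }
    else
    {
        File.Move(tempFileName, fileName);
    }
}

// Usage with throwaway files in a scratch directory.
string dir = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName());
Directory.CreateDirectory(dir);
string file = Path.Combine(dir, "data.bin");

File.WriteAllBytes(file, new byte[] { 1, 2, 3 });            // "original"
File.WriteAllBytes(file + ".tmp", new byte[] { 2, 3, 4 });   // "encrypted" temp

SwapIntoPlace(file + ".tmp", file);
Console.WriteLine(string.Join(",", File.ReadAllBytes(file)));
```

On recent .NET versions there is also a File.Move overload that takes an overwrite flag, which may be simpler if you don't need a backup.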

If you really, really want to do copying the hard way then you can use the CopyTo method to write one stream's content into another. So we could have done this:

// Note that 'File.Create' will clear any existing content from the file
// if it exists, so we don't need to delete the file.
using (var original = File.Create(file))
using (var temp = File.OpenRead(file + ".tmp"))
{
	temp.CopyTo(original);
}

// Finish up by removing the temporary file
File.Delete(file + ".tmp");

The decode method is the same except for the Data[i] -= 1; to revert the change we made in the encoder, so that's simple enough to implement from this... but now we're writing the same code twice, and that's usually considered to be a Bad Idea (aka Code Smell). So let's try for something that we can use more generally.

Rather than writing the transformation code twice, let's create a single method that we can use for general file transformation. The actual transformation will be done by an external method that we supply as a parameter, so this will just do the read/write/rename side of things:

// 64KB blocks by default
const int DefaultBlockSize = 64 * 1024;

// Transform a file by calling the 'transform' method.
public static void TransformFile(string fileName, Action<byte[], int> transform, int blockSize = DefaultBlockSize)
{
	var tempFileName = fileName + ".tmp";

	// Read blocks, pass them to transform method, write to temp
	if (blockSize < 1024)
		blockSize = 1024;
	byte[] buffer = new byte[blockSize];

	using (var temp = File.Create(tempFileName))
	using (var source = File.OpenRead(fileName))
	{
		int readCount;
		while ((readCount = source.Read(buffer, 0, blockSize)) > 0)
		{
			// 'transform' changes the supplied buffer
			transform(buffer, readCount);
			temp.Write(buffer, 0, readCount);
		}
	}
	
	File.Delete(fileName);
	File.Move(tempFileName, fileName);
}

This should look familiar since it's basically the same as the code above, just with the actual transformation code moved out. Now instead of writing the same code twice you can call this single piece of code with different parameters to get different results. The outer shell - the process of reading, transforming and writing data - is the same for both encoding and decoding.

For your use case you can then write the 'Encode' and 'Decode' methods like this:

private static void EncodeTransform(byte[] buffer, int count)
{
	for (int i = 0; i < count; i++)
		buffer[i] += 1;
}

private static void DecodeTransform(byte[] buffer, int count)
{
	for (int i = 0; i < count; i++)
		buffer[i] -= 1;
}

public static void EncodeFile(string fileName)
{
	TransformFile(fileName, EncodeTransform);
}

public static void DecodeFile(string fileName)
{
	TransformFile(fileName, DecodeTransform);
}

From there you can adjust the transform methods as you wish, as long as they're symmetrical: the same size output as input, no headers or other additions to the file data, etc.
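
As one example of a symmetrical transform, here is a rough sketch of an XOR transform with the same (byte[], int) shape as the methods above. XorTransform and the key value are made up for illustration. Because XOR-ing with the same key twice restores the original byte, a single method can serve as both encoder and decoder:

```csharp
using System;
using System.Linq;

// Hypothetical transform: same signature as EncodeTransform/DecodeTransform.
static void XorTransform(byte[] buffer, int count)
{
    const byte Key = 0x5A;   // arbitrary illustrative key, an assumption
    for (int i = 0; i < count; i++)
        buffer[i] ^= Key;
}

byte[] original = { 0, 1, 127, 128, 255 };
byte[] working = (byte[])original.Clone();

Action<byte[], int> transform = XorTransform;

transform(working, working.Length);                 // "encode"
bool changed = !working.SequenceEqual(original);

transform(working, working.Length);                 // "decode" with the same method
bool restored = working.SequenceEqual(original);

Console.WriteLine($"changed: {changed}, restored: {restored}");
```

Like the +1/-1 pair, this is obfuscation rather than real encryption, but it keeps the output exactly the same size as the input, which is all TransformFile needs.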

The TransformFile method doesn't care what you're doing with that data. It just reads blocks of the file, sends them to your transform method, then writes them back. You can use that for any block-style symmetric transformation you like.

Asymmetric transforms are a bit more involved since you might need to deal with things like reading different-sized blocks from the source. Imagine that you're compressing each block to an inconsistent, smaller size, but to decompress you need to pass each block individually to the decompressor, so you need to know how big each block is. If you compress 64K down to 5K you don't want to write a bunch of empty data, so how do you go about tracking that?

And of course when you decompress the data you'll need a larger buffer to put the output into. How big does that buffer need to be?

At this point we're getting into themes involving file formats: file headers, block headers, asymmetric block sizes and so on.
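
To make that a little more concrete, here is one possible sketch of a block header approach. The layout (two int32 lengths in front of each compressed block) and the names CompressBlocks and DecompressBlocks are assumptions chosen for illustration, not an established format. Storing both lengths answers the two questions above: how many bytes to read back, and how big the output buffer must be.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

// Hypothetical layout per block:
// [int32 original length][int32 compressed length][compressed bytes]
static void CompressBlocks(Stream source, Stream target, int blockSize)
{
    var buffer = new byte[blockSize];
    using var writer = new BinaryWriter(target, Encoding.UTF8, leaveOpen: true);
    int readCount;
    while ((readCount = source.Read(buffer, 0, blockSize)) > 0)
    {
        using var packed = new MemoryStream();
        using (var deflate = new DeflateStream(packed, CompressionMode.Compress, leaveOpen: true))
            deflate.Write(buffer, 0, readCount);   // disposing flushes the compressor

        byte[] payload = packed.ToArray();
        writer.Write(readCount);        // size of the block once decompressed
        writer.Write(payload.Length);   // number of compressed bytes that follow
        writer.Write(payload);
    }
}

static void DecompressBlocks(Stream source, Stream target)
{
    using var reader = new BinaryReader(source, Encoding.UTF8, leaveOpen: true);
    while (source.Position < source.Length)   // assumes a seekable stream
    {
        int originalLength = reader.ReadInt32();
        int packedLength = reader.ReadInt32();
        byte[] payload = reader.ReadBytes(packedLength);

        var block = new byte[originalLength];
        using var deflate = new DeflateStream(new MemoryStream(payload), CompressionMode.Decompress);
        int offset = 0, n;
        // Read may return fewer bytes than requested, so loop until the block is full.
        while (offset < originalLength && (n = deflate.Read(block, offset, originalLength - offset)) > 0)
            offset += n;
        target.Write(block, 0, offset);
    }
}

// Round trip on repetitive (highly compressible) sample data.
byte[] data = new byte[200_000];
for (int i = 0; i < data.Length; i++) data[i] = (byte)(i % 7);

using var packedFile = new MemoryStream();
CompressBlocks(new MemoryStream(data), packedFile, 64 * 1024);

packedFile.Position = 0;
using var restored = new MemoryStream();
DecompressBlocks(packedFile, restored);

Console.WriteLine($"original {data.Length}, packed {packedFile.Length}, restored {restored.Length}");
```

A real format would probably also want a file header (magic bytes, version) and some validation of the lengths it reads, which this sketch leaves out.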

I'll leave you with this old article by James McCaffrey relating to custom Stream implementations to do something similar. I hope it gives you some ideas.

huangapple
  • Posted on June 12, 2023 at 02:51:18
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76452032.html