ConcurrentDictionary missing values?
# Question
I have the following code:

```csharp
ConcurrentDictionary<string, XElement> DocumentsElement = new ConcurrentDictionary<string, XElement>();

public void AddDocument(string docId)
{
    bool added;
    try
    {
        var documentElement = new XElement("Document", new XAttribute("DocID", docId));
        lock (lockObject)
        {
            added = DocumentsElement.TryAdd(docId, documentElement);
            string path = $"{_batchNumber}";
            if (!Directory.Exists(path)) { Directory.CreateDirectory(path); }
            File.AppendAllText($"{path}/{docId}.txt", documentElement.ToString());
        }
    }
    catch (Exception ex)
    {
        added = false;
    }
    if (added)
    {
        Debug.WriteLine("Document added successfully.");
    }
    else
    {
        Debug.WriteLine("Failed to add document.");
    }
    lock (lockObject)
    {
        Debug.WriteLine("From AddDocument method " + DocumentsElement.Count);
    }
}
```
The issue is that at the end of the entire process, `DocumentsElement` is missing some values, even though `File.AppendAllText` correctly writes each `XElement` value to a text file and all of the files are there. Multiple threads call this `AddDocument` method.

For example:

DocumentsElement.Count = 4495
Number of text files = 4500 (the correct count)

Is there a synchronization issue that I'm missing?

The calling code is in my main app, in `Process(ConcurrentDictionary<long, byte> idList)`:
```csharp
private ConcurrentDictionary<string, DocumentLoadFile> loadfiles;

private async Task Process(ConcurrentDictionary<long, byte> idList)
{
    var tasks = idList.Select(async (id) =>
    {
        var metadata = FetchMetadataForId(id);
        var loadfile = loadfiles[metadata.BatchId];

        loadfile.AddDocument(metadata.DocumentId);
        if (metadata.HasAttachment)
            ProcessAttachment(metadata.AttachmentId);
    });
    await Task.WhenAll(tasks);
    //dostuff
}

private void ProcessAttachment(long attId)
{
    var metadata = FetchMetadataForId(attId);
    var loadfile = loadfiles[metadata.BatchId];

    loadfile.AddDocument(metadata.DocumentId);
}
```
</details>
# Answer 1
**Score**: 2
You are adding to the file unconditionally, while the dictionary skips duplicate keys. From the `ConcurrentDictionary<TKey,TValue>.TryAdd(TKey, TValue)` docs:

> Attempts to add the specified key and value to the `ConcurrentDictionary<TKey,TValue>`.
>
> Returns `true` if the key/value pair was added to the `ConcurrentDictionary<TKey,TValue>` successfully; `false` if the key already exists.

Try writing to the file only when the item was not already present in the dictionary:
```csharp
var documentElement = new XElement("Document", new XAttribute("DocID", docId));
added = DocumentsElement.TryAdd(docId, documentElement);
if (added)
{
    lock (lockObject)
    {
        string path = $"{_batchNumber}";
        if (!Directory.Exists(path)) { Directory.CreateDirectory(path); }
        File.AppendAllText($"{path}/{docId}.txt", documentElement.ToString());
    }
}
```
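
The `TryAdd` behaviour quoted above can be reproduced in isolation. The following is a minimal sketch, not the asker's code: the `TryAddDemo` class and the input (4500 IDs, of which 5 repeat an earlier ID) are hypothetical and chosen only to mirror the counts in the question. It shows that each duplicate key makes `TryAdd` return `false`, so `Count` ends up lower than the number of calls.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class TryAddDemo
{
    public static (int Attempts, int Stored, int Rejected) Run()
    {
        // Hypothetical input: 4495 distinct IDs plus 5 IDs that repeat earlier ones.
        var ids = Enumerable.Range(0, 4495).Select(i => $"DOC{i}")
            .Concat(Enumerable.Range(0, 5).Select(i => $"DOC{i}"))
            .ToArray();

        var dictionary = new ConcurrentDictionary<string, string>();
        int rejected = 0;

        Parallel.ForEach(ids, id =>
        {
            // TryAdd returns false when the key already exists; it neither
            // overwrites the stored value nor throws.
            if (!dictionary.TryAdd(id, id))
                Interlocked.Increment(ref rejected);
        });

        return (ids.Length, dictionary.Count, rejected);
    }

    public static void Main()
    {
        var (attempts, stored, rejected) = Run();
        Console.WriteLine($"Attempts: {attempts}, stored: {stored}, rejected: {rejected}");
    }
}
```

If the real input contains repeated `docId` values, logging every `false` result from `TryAdd` in this way is a quick check of which IDs are affected.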
# Answer 2
**Score**: 1
As a first step, I'd recommend making (and understanding) these changes, which shorten the method a little and tell you when there are duplicates:
```csharp
public void AddDocument(string docId)
{
    try
    {
        lock (lockObject)
        {
            var documentElement = new XElement("Document", new XAttribute("DocID", docId));
            // Use string interpolation here: Debug.WriteLine(string, string)
            // treats the second argument as a category, not a format argument.
            if (!DocumentsElement.TryAdd(docId, documentElement))
                Debug.WriteLine($"Duplicate : {docId}");
            // File I/O is slow. You may consider outsourcing this
            // to a queue and execute in parallel.
            string path = $"{_batchNumber}";
            if (!Directory.Exists(path)) { Directory.CreateDirectory(path); }
            File.AppendAllText($"{path}/{docId}.txt", documentElement.ToString());
            Debug.WriteLine("Document added successfully.");
            Debug.WriteLine("From AddDocument method " + DocumentsElement.Count);
        }
    }
    catch (Exception ex)
    {
        // Include the exception message so the failure is not silently swallowed.
        Debug.WriteLine("Failed to add document. " + ex.Message);
    }
}
```
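
The comment about moving file I/O to a queue could look roughly like the sketch below. It is one possible shape under stated assumptions, not the answerer's implementation: `FileWriteQueue` and its members are hypothetical names, and a single background consumer is assumed so that appends to the same file never race each other. Callers only enqueue the path and content, and the writes happen on the consumer task.

```csharp
using System;
using System.Collections.Concurrent;
using System.IO;
using System.Threading.Tasks;

// Hypothetical helper; not part of the original code.
public sealed class FileWriteQueue : IDisposable
{
    private readonly BlockingCollection<(string FilePath, string Content)> _queue =
        new BlockingCollection<(string FilePath, string Content)>();
    private readonly Task _consumer;

    public FileWriteQueue()
    {
        // One consumer drains the queue, so file writes are serialized
        // without the producers having to hold a lock during I/O.
        _consumer = Task.Run(() =>
        {
            foreach (var (filePath, content) in _queue.GetConsumingEnumerable())
            {
                var directory = Path.GetDirectoryName(filePath);
                // CreateDirectory does nothing if the directory already exists.
                if (!string.IsNullOrEmpty(directory))
                    Directory.CreateDirectory(directory);
                File.AppendAllText(filePath, content);
            }
        });
    }

    // Called from any thread; returns as soon as the item is queued.
    public void Enqueue(string filePath, string content) => _queue.Add((filePath, content));

    public void Dispose()
    {
        // Stop accepting items, then wait until all pending writes are flushed.
        _queue.CompleteAdding();
        _consumer.Wait();
        _queue.Dispose();
    }
}
```

With something like this in place, `AddDocument` would call `Enqueue($"{path}/{docId}.txt", documentElement.ToString())` instead of writing inside the lock, and the owner would dispose the queue once all documents have been processed. Any I/O exception surfaces when `Dispose` waits on the consumer task.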
</details>