英文:
Azure Function triggered by Queue times out regularly
问题
I have a linux Azure Function that is triggered by a Queue. Most messages that the Function attempts to process end up in the 'poison' queue and the only logs I can see indicate that it timed out.
2023-06-12 20:00:14.132  Executing 'Functions.Drone' (Reason='New queue message detected on 'drone-input-queue'.', Id=6e66d069-0b09-4966-92dd-d63a8f9aa3fc)  Information
2023-06-12 20:00:14.133  Trigger Details: MessageId: 3c54e04a-3d96-4ab9-ab69-3101f77a16f1, DequeueCount: 1, InsertionTime: 2023-06-12T20:00:10.000+00:00  Information
2023-06-12 20:10:14.133  Timeout value of 00:10:00 exceeded by function 'Functions.Drone' (Id: '6e66d069-0b09-4966-92dd-d63a8f9aa3fc'). Initiating cancellation.  Error
2023-06-12 20:10:14.133  Executed '{functionName}' ({status}, Id={invocationId}, Duration={executionDuration}ms)  Error
2023-06-12 20:10:14.133  Executed 'Functions.Drone' (Failed, Id=6e66d069-0b09-4966-92dd-d63a8f9aa3fc, Duration=600000ms)  Error
2023-06-12 20:10:14.133  Error
Potentially relevant information:
- The Function App is using a Consumption plan. I know there can be cold starts, but I see this happen, like, 90% of the time, even when messages are coming in every few minutes.
 - The Function App and Queue are set up using an ARM template. The Function is later deployed from an archive.
 - The Function is using Linux as the OS type and is running Java code.
 - I do occasionally see proper execution, so 'AzureWebJobsStorage', 'STORAGE_QUEUE_CONNECTION_STRING', and 'WEBSITE_CONTENTAZUREFILECONNECTIONSTRING' should all be set correctly.
 - I've double and triple checked the host.json and function.json, but here they are, for reference:
host.json 
{
  "version": "2.0",
  "functionTimeout": "00:10:00",
  "extensionBundle": {
    "id": "Microsoft.Azure.Functions.ExtensionBundle",
    "version": "[2.*, 3.0.0)"
  },
  "extensions": {
    "queues": {
      "maxPollingInterval": "00:00:02",
      "visibilityTimeout" : "00:00:30",
      "batchSize": 16,
      "maxDequeueCount": 3,
      "newBatchThreshold": 8
    }
  }
}
function.json
{
  "scriptFile" : "../my.package.drone.jar",
  "entryPoint" : "my.package.Drone.run",
  "bindings": [
    {
      "name": "message",
      "type": "queueTrigger",
      "direction": "in",
      "queueName": "catapult-drone-input-queue"
    },
    {
      "name": "output",
      "type": "queue",
      "direction": "out",
      "queueName": "catapult-drone-output-queue"
    }
  ]
}
I was initially deploying the Function with an outdated version of the CLI. While updating that fixed some issued, it did not fix this one. Here is the version info.
# az version
{
  "azure-cli": "2.39.0",
  "azure-cli-core": "2.39.0",
  "azure-cli-telemetry": "1.0.6",
  "extensions": {}
}
# func
...
Azure Functions Core Tools
Core Tools Version:       3.0.4899 Commit hash: N/A  (64-bit)
Function Runtime Version: 3.17.0.0
...
I suspect it might have something to do with the scaling configuration, but am at a loss for how to configure that. Any advice would be appreciated.
Here is a snippet of the entrypoint code:
@FunctionName("Drone")
public void run(@QueueTrigger(name = "message", queueName = "drone-input-queue") String message, 
                @QueueOutput(name = "output", queueName = "drone-output-queue") OutputBinding<String> output,
                final ExecutionContext context) {
    String id = UUID.randomUUID().toString();
    logger.setContext(context); //custom tool for logging to context logger and/or blob
    logger.info("[" + id + "] Start");
    encryption = initializeInFlightEncryptionClient(logger);
    logger.info("[" + id + "] encryption initialized");
    jsonSigner = initializeJsonSigningUtility(logger);
    logger.info("[" + id + "] jsonSigner initialized");
    try {
        // ... logic to handle message ...
    }
    catch(Exception e) {
        logger.severe("ERROR: " + e.getMessage(), e);
    }
    finally {
        logger.closeLogFile();
    }
}
I forgot to mention that I have fairly extensive logging in this part of the code, so I know that, at the very least, it isn't getting past 'logger.setContext', which only sets a private variable to context.getLogger(). That's why I didn't bother sharing code earlier.
Also, I have seen messages that take only 30 seconds to process cause this timeout. If I send like 10 of them at once, the first may get processed correctly and quickly, but the next 9 tend to all have this timeout. That's why I suspect something wonky with the configuration.
P.S. This is the ONLY Function in the Function App.
英文:
I have a linux Azure Function that is triggered by a Queue. Most messages that the Function attempts to process end up in the 'poison' queue and the only logs I can see indicate that it timed out.
2023-06-12 20:00:14.132  Executing 'Functions.Drone' (Reason='New queue message detected on 'drone-input-queue'.', Id=6e66d069-0b09-4966-92dd-d63a8f9aa3fc)  Information
2023-06-12 20:00:14.133  Trigger Details: MessageId: 3c54e04a-3d96-4ab9-ab69-3101f77a16f1, DequeueCount: 1, InsertionTime: 2023-06-12T20:00:10.000+00:00  Information
2023-06-12 20:10:14.133  Timeout value of 00:10:00 exceeded by function 'Functions.Drone' (Id: '6e66d069-0b09-4966-92dd-d63a8f9aa3fc'). Initiating cancellation.  Error
2023-06-12 20:10:14.133  Executed '{functionName}' ({status}, Id={invocationId}, Duration={executionDuration}ms)  Error
2023-06-12 20:10:14.133  Executed 'Functions.Drone' (Failed, Id=6e66d069-0b09-4966-92dd-d63a8f9aa3fc, Duration=600000ms)  Error
2023-06-12 20:10:14.133  Error
Potentially relevant information:
- The Function App is using a Consumption plan. I know there can be cold starts, but I see this happen, like, 90% of the time, even when messages are coming in every few minutes.
 - The Function App and Queue are set up using an ARM template. The Function is later deployed from an archive.
 - The Function is using Linux as the OS type and is running Java code.
 - I do occasionally see proper execution, so 'AzureWebJobsStorage', 'STORAGE_QUEUE_CONNECTION_STRING', and 'WEBSITE_CONTENTAZUREFILECONNECTIONSTRING' should all be set correctly.
 - I've double and triple checked the host.json and function.json, but here they are, for reference:
host.json 
{
  "version": "2.0",
  "functionTimeout": "00:10:00",
  "extensionBundle": {
    "id": "Microsoft.Azure.Functions.ExtensionBundle",
    "version": "[2.*, 3.0.0)"
  },
  "extensions": {
    "queues": {
      "maxPollingInterval": "00:00:02",
      "visibilityTimeout" : "00:00:30",
      "batchSize": 16,
      "maxDequeueCount": 3,
      "newBatchThreshold": 8
    }
  }
}
function.json
{
  "scriptFile" : "../my.package.drone.jar",
  "entryPoint" : "my.package.Drone.run",
  "bindings": [
    {
      "name": "message",
      "type": "queueTrigger",
      "direction": "in",
      "queueName": "catapult-drone-input-queue"
    },
    {
      "name": "output",
      "type": "queue",
      "direction": "out",
      "queueName": "catapult-drone-output-queue"
    }
  ]
}
- I was initially deploying the Function with an outdated version of the CLI. While updating that fixed some issued, it did not fix this one. Here is the version info.
 
# az version
{
  "azure-cli": "2.39.0",
  "azure-cli-core": "2.39.0",
  "azure-cli-telemetry": "1.0.6",
  "extensions": {}
}
# func
...
Azure Functions Core Tools
Core Tools Version:       3.0.4899 Commit hash: N/A  (64-bit)
Function Runtime Version: 3.17.0.0
...
I suspect it might have something to do with the scaling configuration, but am at a loss for how to configure that. Any advice would be appreciated.
Here is a snippet of the entrypoint code:
@FunctionName("Drone")
public void run(@QueueTrigger(name = "message", queueName = "drone-input-queue") String message, 
                @QueueOutput(name = "output", queueName = "drone-output-queue") OutputBinding<String> output,
                final ExecutionContext context) {
    String id = UUID.randomUUID().toString();
    logger.setContext(context); //custom tool for logging to context logger and/or blob
    logger.info("["+id+"] Start");
    encryption = initializeInFlightEncryptionClient(logger);
    logger.info("["+id+"] encryption initialized");
    jsonSigner = initializeJsonSigningUtility(logger);
    logger.info("["+id+"] jsonSigner initialized");
    try {
        // ... logic to handle message ...
    }
    catch(Exception e) {
        logger.severe("ERROR: "+e.getMessage(), e);
    }
    finally {
        logger.closeLogFile();
    }
}
I forgot to mention that I have fairly extensive logging in this part of the code, so I know that, at the very least, it isn't getting past 'logger.setContext', which only sets a private variable to context.getLogger(). That's why I didn't bother sharing code earlier.
Also, I have seen messsages that take only 30 seconds to process cause this timeout. If I send like 10 of them at once, the first may get processed correctly and quickly, but the next 9 tend to all have this timeout. That's why I suspect something wonky with the configuration.
P.S. This is the ONLY Function in the Function App.
答案1
得分: 0
你的函数执行时间超过了10分钟。在本地运行时,你可以将超时时间设置为更大的值。
如果执行时间太长,你有以下选择:
- 应用计划(无限制)
 - 将其拆分成较小的部分
 - 使用可靠函数
 
英文:
Your function is talking longer than 10 minutes to complete. Ho wlong does it talk to complete locally? When run locally, you can set the timeout to a larger value.
If it just takes too long, your options are:
- App plan (no limits)
 - Break it up into smaller pieces
 - Use durable functions
 
答案2
得分: 0
经过大量研究,并根据评论和其他答案提供的一些线索,我发现将FUNCTIONS_WORKER_PROCESS_COUNT = 2设置在应用设置中,几乎完全消除了这个问题 - 到了一个不再是问题的程度。无论这是否是一个好的解决方案,我不知道,但这是我的目前结果。
英文:
After much research and following some of the leads provided by the comments and other answers, I found setting FUNCTIONS_WORKER_PROCESS_COUNT = 2 in the App Settings resulted in the issue almost completely disappearing - to the point it wasn't an issue anymore.  Whether this is a good solution or not, I don't know, but that is my results thus far.
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。


评论