How to design a fitness function for a Genetic Algorithm that moves a car to a point through a city?

huangapple go评论48阅读模式
英文:

How to design a fitness function for a Genetic Algorithm that moves a car to a point through a city?

问题

我正在尝试设计一个遗传算法与神经网络结合,以使一辆汽车在Unity生成的城市中移动到一个随机目标位置。我已经设计了它,代码也可以运行,但是代理程序只学会避开障碍物,而不是移动到指定的目标位置。

目前我的适应度函数是:
((1 / (1 + totalDistanceTravelled)) + (1 / (1 + distanceToTarget)) + (1 / (1 + numberOfCollisions)) + sensors.Average())

我该如何修改这个函数以满足期望的结果?

如果需要更多信息,请告诉我。

英文:

I am trying to design a genetic algorithm with neural networks to move a car through a city generated in Unity to a randomised target destination. I have designed it and the code works but the agent only learns to avoid obstacles and not move to the specified target destination.

Currently my fitness function is:
((1 / (1 + totalDistanceTravelled)) + (1 / (1 + distanceToTarget)) + (1 / (1 + numberOfCollisions)) + sensors.Average())

How can I modify this to meet the desired outcome?

If more is required let me know.

答案1

得分: 2

设计一个良好的适应函数并不是一项微不足道的任务,事实上,这通常是创建成功的遗传算法的核心任务。你没有提供太多细节,但我会尽量提供一些有用的分析。

良好适应函数的一个要求是,它必须为基因空间中至少一些起始点提供通往目标的平滑路径。也就是说,如果你从基因空间的一个随机点开始,你能否主要通过适应函数作为指南逐渐迁移到基因空间中更好的地方。

看看你的适应函数,我倾向于认为最容易找到的解决方案就是不移动太多。让我们将适应函数分解为其组成部分:

  • 1/(1 + distance-travelled)
  • 1/(1 + distance-to-target)
  • 1/(1 + number-collisions)
  • sensor-average - 不确定这是什么?

假设 distance-to-target 是 10。前三个组成部分是 1 + 1/10 + 1。如果你朝目标移动一步而不与任何东西碰撞,前三个组成部分是 1/2 + 1/9 + 1,比以前要小。显然,这不是你希望得到的对于正确移动的反馈。

对于这个任务,我认为你希望 distance-to-target 主导适应度,即使你接近目标,不考虑其他因素,适应度也应该增加。

我脑海中没有一个理想的适应函数,但我会仔细考虑一下。

假设等效的 distance-to-targetdistance-travelled 应该是下一个最重要的因素。假设两个结果都达到 5distance-to-target,走过更短距离的那个应该得分更高。

number-collision 需要相对于 distance-travelled,否则,你可以通过不移动来最小化 number-collisions

这个想法是你需要仔细研究适应函数的细节,并确保它总是提供正确的反馈。这并不像听起来那么容易,因为大多数适应函数有几个变量。

如果有一种险恶的解决方案在适应函数中找到了局部最大值,而实际上并没有解决问题,算法很可能会找到它。

英文:

Designing a good fitness function is not a trivial task, and, in fact, it is usually the central task for creating a successful genetic algorithm. You have not provided much detail, but I will try to provide some useful analysis.

One requirement of a good fitness function is that it must provide a smooth path to where you want to go from at least some of the starting points in gene-space. That is, if you start at a random point in gene-space, can you mostly, incrementally migrate to better and better places in gene-space using the fitness function as a guide.

Looking at your fitness function, my inclination is that the easiest solution to find is simply not moving much. Let's break the fitness function down into its components:

  • 1/(1 + distance-travelled)
  • 1/(1 + distance-to-target)
  • 1/(1 + number-collisions)
  • sensor-average - not sure what this is?

Let's suppose distance-to-target is 10. The first three components are 1 + 1/10 + 1. If you move one-step closer to the target without colliding with anything, the first three components are 1/2 + 1/9 + 1 which is less than before. That is obviously not the feedback you want to be giving for making a correct move.

For this task, I think you want distance-to-target to dominate the fitness, i.e. if you get closer to the target, regardless of the other factors, fitness should increase.

I don't have an ideal fitness function off the top of my head, but I will give it some thought.

Assuming equivalent distance-to-taret's, distance-travelled should be the next most important factor. Assuming two results get to 5 for distance-to-target, the one that travelled less distance should score higher.

number-collision needs to be relative to the distance-travelled, otherwise, you can minimize number-collisions by not traveling.

The idea is that you need to work through the details of the fitness function and make sure that it is always providing the correct feedback. This is not as easy as it sounds since most fitness functions have several variables.

If there is a pernicious solution that finds a local maximum in the fitness function without actually solving the problem, the algorithm will probably find it.

答案2

得分: 1

让我们更仔细地查看您的健身函数,可能更容易阅读(尽管我认为括号也有助于这一点):

(
  (
    1 / (1 + 总行驶距离)
  ) + 
  (
    1 / (1 + 目标距离)
  ) + 
  (
     1 / (1 + 碰撞次数)
  ) + 
  传感器平均值
)

看起来您同样权衡了行驶距离、目标距离和碰撞次数。不确定传感器的sensors.Average()是什么(不熟悉Unity),但我认为这可能不是问题所在。

假设碰撞次数远小于行驶距离或目标距离。假设我们有10次碰撞,已经行驶了100个单位,还需要行驶100个单位。

前两项为0.0099,第三项为0.09。如果我们认为数值越大越好,那么它肯定会更注重避免碰撞,至少在接近目标或发生大量碰撞之前是如此。

更好的选择是将碰撞次数用作系数。类似这样:

(
  (
    (
      1 / (1 + 总行驶距离)
    ) + 
    (
      1 / (1 + 目标距离)
    )
  ) * 
  (
     1 / (1 + 碰撞次数)
  ) + 
  传感器平均值
)

在这种情况下,我们平等考虑了距离,然后将其与碰撞因素相乘。

英文:

Let's take a closer look at your fitness function in a perhaps more readable way (though I figure the parenthesis are there to help with that too):

(
  (
    1 / (1 + totalDistanceTravelled)
  ) + 
  (
    1 / (1 + distanceToTarget)
  ) + 
  (
     1 / (1 + numberOfCollisions)
  ) + 
  sensors.Average()
)

Looks like you are equally weighing distance traveled, distance to target, and number of collisions. Not sure what the sensors.Average() is (not familiar with unity), but I will figure that it's probably not the issue.

Let's assume the number of collisions is much less than the distance traveled or distance to the target. Say we have 10 collisions, we have traveled 100 units and we have 100 units left to travel.

The first two are 0.0099, the second one is 0.09. If we figure higher is better, then it would definitely give priority to avoiding stuff, at least until you are super close to the target or you have had a lot of collisions.

A better option would be to use the number of collisions as a coefficient. Something like:

(
  (
    (
      1 / (1 + totalDistanceTravelled)
    ) + 
    (
      1 / (1 + distanceToTarget)
    )
  ) * 
  (
     1 / (1 + numberOfCollisions)
  ) + 
  sensors.Average()
)

In this case, we equally consider the distances, then we multiply that by the factor of the collisions

huangapple
  • 本文由 发表于 2023年5月18日 06:43:01
  • 转载请务必保留本文链接:https://go.coder-hub.com/76276635.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定