This may be the first time in the history of the GTOC that the team with the best score without bonus does not win the competition.
Power 4 in the bonus function is a bit too much, given that the task is so easy (and fast) to solve this time.
Even if Tsinghua later tries to submit a better solution, it will probably not be scored because of the decreasing bonus.
Something they could think about when they design the GTOC11 merit function and leaderboard. Looking forward to the first Chinese GTOC. Another wish for the GTOC11 organizers: please don't plan it as close to the CTOC as it was this time. I feel a bit exhausted after two competitions in a row.
The bonus has a maximum value of a 2x multiplier. The 4th power is a positive thing because B is already fading quickly to 1. Nobody is even close to solving this problem, and I expect final scores in the high hundreds of thousands compared to the high score of ~300 right now.
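To see how quickly such a bonus fades, here is a minimal sketch. It assumes the bonus has the form B = 1 + (1 - f)^4, where f is the fraction of the submission period that has elapsed. That form is an assumption chosen to match what is said in this thread (maximum 2x, 4th power, fading to 1); the official problem statement is the authority on the exact definition. The function name and the `exponent` parameter are illustrative only.

```python
def bonus(fraction_elapsed: float, exponent: int = 4) -> float:
    """Assumed bonus B = 1 + (1 - fraction_elapsed)**exponent, so 1 <= B <= 2."""
    if not 0.0 <= fraction_elapsed <= 1.0:
        raise ValueError("fraction_elapsed must lie in [0, 1]")
    return 1.0 + (1.0 - fraction_elapsed) ** exponent


if __name__ == "__main__":
    # Compare the assumed 4th-power decay with a plain linear decay.
    for pct in (0, 10, 25, 50, 75, 100):
        f = pct / 100
        print(f"{pct:3d}% elapsed: B = {bonus(f):.4f} (power 4), "
              f"{bonus(f, 1):.4f} (linear)")
```

Under this assumption the bonus is already down to about 1.06 halfway through the period, whereas a linear decay would still give 1.5 at that point.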
Can you elaborate on your expectation? What makes you think such high values are possible? My suspicion is that you can derive an upper limit for the score from the spatial distribution error term E_r. Maybe it's slightly higher than 400, but not by much. If you visit many stars you cannot maintain an extremely low dvUsed value.
Dario has shown us all that the simple solutions really can be beaten. But I still think the aggressive bonus function put enormous pressure on the ESA team.
I just performed the following experiment:
Select a random star subset and compute its J-value, not considering dv and bonus. Repeat this 10000 times and report the best J together with the size of the best subset. I repeated this experiment 10 times with the following results:
best J = 26.265 number settlements = 49
best J = 27.892 number settlements = 47
best J = 31.61 number settlements = 75
best J = 31.964 number settlements = 79
best J = 36.669 number settlements = 87
best J = 31.179 number settlements = 97
best J = 30.052 number settlements = 81
best J = 33.733 number settlements = 61
best J = 29.509 number settlements = 97
best J = 28.153 number settlements = 73
Do you see how small the best random subsets are? Probably not what we expected.
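The experiment described above can be sketched roughly as follows. This is not the poster's actual code: the subset size is drawn uniformly here (the post does not say how sizes were chosen), and `toy_score` is a made-up stand-in for the real J-value without dv and bonus, which would need the star catalogue and the error terms from the problem statement.

```python
import random
from typing import Callable, Optional, Sequence, Tuple


def best_random_subset(
    n_stars: int,
    score: Callable[[Sequence[int]], float],
    trials: int = 10_000,
    max_size: int = 100,
    rng: Optional[random.Random] = None,
) -> Tuple[float, int]:
    """Score `trials` random subsets and return (best score, size of best subset)."""
    rng = rng or random.Random()
    best_j, best_size = float("-inf"), 0
    for _ in range(trials):
        # Assumption: subset size is uniform in [1, max_size].
        size = rng.randint(1, min(max_size, n_stars))
        subset = rng.sample(range(n_stars), size)
        j = score(subset)
        if j > best_j:
            best_j, best_size = j, size
    return best_j, best_size


def toy_score(subset: Sequence[int]) -> float:
    """Hypothetical placeholder score that peaks at 10 stars; NOT the GTOC merit function."""
    n = len(subset)
    return n / (1.0 + n * n / 100.0)


if __name__ == "__main__":
    j, size = best_random_subset(100_000, toy_score, rng=random.Random(0))
    print(f"best J = {j:.3f} number settlements = {size}")
```

To reproduce the numbers in the post, `toy_score` would have to be replaced by a function that evaluates the real error terms for the chosen stars.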
The limit of J as E approaches zero is B*N*dVmax/dVused, with 1 <= B <= 2, 1 <= N <= 100,000, and (dVmax/dVused) >= 1, since dVused cannot exceed dVmax.
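Written out, assuming the merit function has the form below (the constant k and the exact way the error terms enter the denominator are assumptions here; only the limit itself is stated in this thread):

```latex
J = B \cdot \frac{N}{1 + k\,N\,(E_r + E_\theta)} \cdot \frac{\Delta V_{\max}}{\Delta V_{\text{used}}},
\qquad
\lim_{E_r + E_\theta \to 0} J = B \cdot N \cdot \frac{\Delta V_{\max}}{\Delta V_{\text{used}}}.
```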
To be clear, I do expect E to approach zero for the top solutions.
Only if you can find reachable stars for which E approaches zero. That is the problem: stars are simply never positioned so that E_r and E_theta are both zero.
In fact, since R and theta_f are fixed values, the error terms for stars don't change over time. Some stars contribute less to E than others, but no star contributes zero.
It's possible, I promise! You just won't find those configurations by random sampling.
It's possible to focus on stars with low E, I agree. That is something I hadn't realised before. Somehow I saw spatial distribution as a relative measure of the distribution of the stars. I really hadn't seen that, as it is defined here, it also affects single stars.
But I still have doubts about the "expect final scores in the high hundreds of thousands". My gut feeling is now something around 500, even after the competition. Maybe I am wrong.
To get a more realistic upper limit I repeated the experiment above with a sampling which prefers low error stars and tries to maintain a good distribution over different R and theta_f values.
best J = 1185.605 number settlements = 5909
best J = 1195.004 number settlements = 5495
best J = 1183.695 number settlements = 7094
best J = 1229.049 number settlements = 5422
best J = 1219.644 number settlements = 6702
We now see much higher J-scores and optimal numbers of stars. How to connect the stars by settler ship movements is left out of this experiment, so expect a huge drop in J once the stars have to be connected.
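The post does not say how the biased sampling was implemented. One possible way to prefer low-error stars is weighted sampling without replacement, sketched here with Efraimidis-Spirakis keys and a weight of 1/error per star. Both the weighting and the per-star `errors` input are assumptions for illustration, and the sketch does not attempt the second part of the experiment (keeping a good distribution over R and theta_f).

```python
import math
import random
from typing import List, Optional, Sequence


def sample_low_error(
    errors: Sequence[float],
    k: int,
    rng: Optional[random.Random] = None,
) -> List[int]:
    """Pick k distinct star indices, favoring stars with small error.

    Each star gets weight 1/error (assumes every error is > 0) and a key
    log(u) / weight with u uniform in (0, 1]; the k largest keys are kept.
    """
    if not 0 <= k <= len(errors):
        raise ValueError("k must be between 0 and the number of stars")
    rng = rng or random.Random()
    keys = []
    for index, err in enumerate(errors):
        weight = 1.0 / err
        u = 1.0 - rng.random()  # in (0, 1], avoids log(0)
        keys.append((math.log(u) / weight, index))
    keys.sort(reverse=True)
    return [index for _, index in keys[:k]]


if __name__ == "__main__":
    # Hypothetical per-star errors: the first three stars are far better.
    errors = [0.01, 0.02, 0.01, 5.0, 7.0, 3.0, 9.0]
    print(sample_low_error(errors, 3, random.Random(42)))
```

Feeding subsets drawn this way into the same best-of-many loop as before gives a biased version of the random experiment.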
But still, since we have more or less bonus factor 1 at the end and limited gains from dVmax / dvUsed, I think a score of J > 1500 is almost impossible at the end.
But a score of J > 450 at the end would be sufficient to invalidate my initial statement about the aggressiveness of the bonus.
Currently NUDT is at about 270 without bonus, so there is a long way to go, we need 60 % more. And I have to admit, at least theoretically this seems possible.
And what is also clear now: the bonus worked and was necessary. We needed NUDT and ESA to be forced to reveal the information that many-star solutions can break the error barrier. Factor 2 also looks reasonable, since it should be possible to improve the best solution by this factor over time. It's more the exponent 4 which is debatable. It strengthened the 5 star solutions, since these are fast to compute. The initial delay of the submission period made it possible to get almost the full factor 2 for these solutions. What if J=500 really is the limit and no one had been fast enough to find a solution before the end of 27-05? Then the cheap solutions would win. Let's see what the teams can achieve. Tsinghua also showed a many-star solution in the beginning; perhaps they will soon reveal what they have by now.
Just found another method to derive the upper limit of the J-score without bonus:
Just use the expertise of the best teams so far (NUDT, ESA) and derive the score limit by applying game theoretical considerations to their observable submission strategy, assuming that they are clever enough to "play the game" in a theoretically optimal way.
To beat Tsinghua's score of 396 at the end you need 396 * 1.0 = 396, a base score without bonus of 396. If NUDT and ESA were confident they could reach this base score, they would never have released their results so early; there is no incentive for them to do so. The best strategy is to reveal only if you have to. So ESA definitely expected a winning base score < 396.
So why did they reveal their base score about 270 now? We cannot assume automatically it is the best they have, we only know that they probably don't have something > 396 now, otherwise they would not have shown anything.
ESA released first. There is no reason to release anything above 397 (including bonus) if you think you can keep up with the decreasing bonus. This is an indication that they probably expect a winning base score way lower than 396, maybe 300? NUDT's move contains less information: it was optimal, and since they had seen ESA's move, the upper limit of their expectation is a base score of 432, ESA's score.
Theoretically there could be a "super team" expecting a base score > 500 which still hides what they have or are expecting to have.
All we know is that NUDT is expecting a base score < 432 and ESA one < 396.
What is the information that NUDT + ESA are expecting these scores worth? You have to check their performance in recent GTOCs and CTOCs, which reveals they are (besides JPL) simply the best of the best. Which means I trust their assessment. It also means that the probability of the "super team" assumption better than ESA and NUDT is low.
Why is the assumption that both teams are "playing" in a game-theoretically optimal way reasonable? For NUDT it is clear that their move (immediate reaction to ESA, not revealing anything) is optimal. For ESA I also seriously doubt that they "play a blunder move". It may be that they really needed until now to find their good solution, but they wouldn't have released it if they expected to find a base score > 396.
Which establishes the upper limit of the base score J = 432, maybe even J = 396 without bonus from game theoretical considerations.
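The arithmetic behind these thresholds is just a division of the listed score by the bonus factor. A small sketch, using the approximate figures quoted in this thread (Tsinghua's listed 396, ESA's listed 432 on a base score of about 270); the helper names are made up for illustration, and the bonus of roughly 1.0 at the end of the competition is the assumption the argument itself makes.

```python
def required_base_score(target_listed: float, bonus: float) -> float:
    """Base score (without bonus) needed to match a listed score under a given bonus."""
    if bonus <= 0:
        raise ValueError("bonus must be positive")
    return target_listed / bonus


def implied_bonus(listed: float, base: float) -> float:
    """Bonus factor implied by a listed score and its base score."""
    if base <= 0:
        raise ValueError("base must be positive")
    return listed / base


if __name__ == "__main__":
    # At the very end the bonus is about 1, so matching 396 needs a base of 396.
    print(required_base_score(396, 1.0))
    # A listed 432 on a base of about 270 corresponds to a bonus of about 1.6.
    print(round(implied_bonus(432, 270), 3))
    # With that same bonus, matching 396 would need a base of only 247.5.
    print(round(required_base_score(396, 1.6), 1))
```

This is why an early submission with a modest base score can outrank a later one with a much higher base score.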
Small clarification: "probability of the "super team" assumption better than ESA and NUDT is low" doesn't mean that the probability that another team wins GTOCX is low, this can definitely happen, but that I don't think that this hypothetical other winning team expects a base score > 432. Base score 300 would be enough to win.
A new move from ESA: 548.668927, congratulations. From a game theoretical perspective we now have to increase the maximal expected base score of ESA to 441, NUDT's score. They played fast, but increased above 442. The only explanation is that they expect this to be the better move since they want to profit from the current bonus. That means they doubt they can compensate for the declining bonus in the future, which indicates a slightly lower expectation for their final score than 441.
I think people are confident that their scores will be beaten pretty soon, so it doesn't really matter whether they show their cards or not. The leaderboard is part of the fun! I'm not sure exactly how long it will take to get to the hundred-thousands, but if people aren't over 10k by a week from now, I'll be very disappointed. Let's see how our predictions shape up!
This is a really interesting merit function. For testing purposes, I caught 20 stars (9 pods, 2 fast ships, and 9 settlers), and landed in last place (even behind those who got only 1 star!) :)
Dear organizers, could you please clarify the idea behind this kind of merit function (considering that the dV's are also restricted)? I suppose there is no sense in trying to settle all 100,000 stars...
I don't want to say too much, but let me write out plainly what we all mostly know. There are basically three knobs to turn, the number of stars, the error, and the dv ratio. Plus, when you turn one knob, you change how hard it is to turn another knob. The error knob basically tells you if your stars are well distributed. Will the globally optimal solution have one of those knobs turned to its limit? If not, how far from the limit should it be? The answers to these questions (especially the latter) are not immediately obvious. That's what the teams will find out!
"people are confident that their scores will be beaten pretty soon, so it doesn't really matter whether they show their cards or not".
Top teams like ESA and NUDT cannot simply assume that others are better. Look at ESA's GTOC record. They usually won. There is no reason for them to assume that their own score will grow less than others expect theirs to. If they submit, they influence what others think is possible, which strengthens their competitors. After ESA's submission I saw within minutes that something in what I had done so far was wrong, and I found the problem soon. Knowing that there is a problem makes it much easier to find it. Of course submitting makes the competition more interesting, although it is wrong from a game theoretical point of view. It could be that ESA feels so strong that they don't think they have to "play" an optimal strategy. To justify the "10k by a week" assumption you need to find a counter argument to the 2nd random sampling experiment above, where I preferred low error stars and maintained a good distribution over different R and theta_f values. Or, in Anastassios' terms: I took the dv knob out of the game and then determined the optimal positions of the error knob and the number-of-stars knob. And we know: to turn the dv knob to its limit, you cannot have many stars.
But back to the original discussion about the bonus function. If a team can beat the initial base scores (without bonus) of the ESA and NUDT "breakthroughs" past the error barrier, which are about J = B * 275, it has only about 1.5 days left to beat Tsinghua, the highest-scoring 5 star solution. NUDT shows progress today (congratulations!) but it is mostly eaten away by the rapidly declining bonus. These observations are still concerning. Hopefully we will soon see other teams break through; otherwise we could end up with only two breakthroughs on the leaderboard, all the others killed by the declining bonus function.
And with ESA's current leading base score (J = B * 349) you land at rank 9, beating NASA MSFC, even 9 days before the competition is finished. Don't underestimate the power of 4 in the bonus function.