Since the conference papers of the GTOCX participants are available, we can use the ideas described there to develop a reference implementation and use it to determine the specific strengths and weaknesses of the participants. Who has the best mother ship/fast ship root tree, who is good at the expansion phase, and who excels at the final optimization? Is there a "Dream Team" - a combination of teams that together could improve on the best result so far?
We distinguish three phases:
1) start - root tree consisting of all mother ship and fast ship transfers.
2) expansion - expansion of the root tree by adding nodes
3) optimization - replacement of stars at any node + removing / adding leaf nodes + timing adjustments
The root trees can be extracted from the submitted solutions. Then we can replace expansion and optimization by a reference implementation to evaluate the performance in a specific phase.
We determine the following three values denoted as PPP, PRR and PPPR:
- PPP = participant start, expansion and optimization
This is the submission score of the participant.
- PRR = the participant start, then reference expansion and optimization
Here we replace expansion and optimization by the reference implementation.
- PPPR = participant start, expansion and optimization, then reference optimization
The submission of the participant further enhanced by the reference optimization.
Now we can determine the performance of a participant in a specific phase by:
start : dependent on PRR
Only the start is used from the participant solution for PRR.
expansion: dependent on PPPR / PRR
Checks what happens if we replace the reference expansion by the one from the participant. Since PPR is not available - we don't have the intermediate tree after the participant expansion - we have to use PPPR as an approximation.
optimization: dependent on PPP / PPPR
Does it help when we additionally apply the reference optimization? If yes, this indicates there "is room for improvement" for the participant optimization.
Results using Jena's current reference implementation (Start, Exp and Opt are normalized so that the best participant in each phase scores 100%):
Participant | Start | Exp  | Opt  | PPP  | PRR  | PPPR |
NUDT        | 100%  | 100% | 98%  | 3101 | 3114 | 3323 |
Tsinghua    | 68%   | 99%  | 97%  | 2070 | 2108 | 2232 |
ESA         | 77%   | 97%  | 85%  | 1996 | 2390 | 2471 |
Aerospace   | 74%   | 90%  | 74%  | 1559 | 2309 | 2214 |
HIT_BACC    | 68%   | 80%  | 68%  | 1167 | 2110 | 1799 |
CSU         | 38%   | 92%  | 100% | 1111 | 1190 | 1164 |
worhp2orb   | 35%   | 94%  | 83%  | 873  | 1098 | 1104 |
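For anyone who wants to recompute the Start, Exp and Opt columns from the PPP, PRR and PPPR scores, here is a minimal Python sketch. It assumes that each raw indicator (PRR, PPPR / PRR, PPP / PPPR) is divided by the best value over all participants and rounded to whole percent; the function name `phase_percentages` is only an illustration and not part of the reference implementation.

```python
# Scores copied from the results table: (PPP, PRR, PPPR) per participant.
scores = {
    "NUDT":      (3101, 3114, 3323),
    "Tsinghua":  (2070, 2108, 2232),
    "ESA":       (1996, 2390, 2471),
    "Aerospace": (1559, 2309, 2214),
    "HIT_BACC":  (1167, 2110, 1799),
    "CSU":       (1111, 1190, 1164),
    "worhp2orb": (873, 1098, 1104),
}


def phase_percentages(scores):
    """Return {name: (start, exp, opt)} as whole percentages.

    Assumption: each raw indicator is normalized by the best value over
    all participants, so the best team in a phase gets 100%.
    """
    # Raw indicators per team: start ~ PRR, expansion ~ PPPR / PRR,
    # optimization ~ PPP / PPPR.
    raw = {
        name: (prr, pppr / prr, ppp / pppr)
        for name, (ppp, prr, pppr) in scores.items()
    }
    # Best value per indicator, used as the 100% reference.
    best = [max(r[i] for r in raw.values()) for i in range(3)]
    return {
        name: tuple(round(100 * r[i] / best[i]) for i in range(3))
        for name, r in raw.items()
    }


percent = phase_percentages(scores)
for name, (start, exp, opt) in percent.items():
    print(f"{name:10s} start {start:3d}%  exp {exp:3d}%  opt {opt:3d}%")
```

With the scores from the table this normalization gives the same percentages as listed there, for example 38% / 92% / 100% for CSU.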
NUDT's 4000 cores decided the competition: a crushing defeat for all others at the start (mother ships + fast ships). Jena's reference implementation performs similarly to NUDT's overall; its strength is optimization, whereas NUDT has the better expansion. Besides NUDT, Tsinghua, ESA and even worhp2orb also outperform the reference expansion. ESA and Aerospace missed the chance for 2nd place after a great start. Tsinghua, despite its handicap at the start, fought hard to finish 2nd. HIT_BACC even had a slightly better start than Tsinghua, but failed to exploit its potential. CSU's optimization is surprisingly good; CSU is the only team able to narrowly beat NUDT in one discipline. My suspicion is that NUDT's optimization is constrained by the fact that they use a limited set of stars.
Possible "Dream Team" candidates (> 95%) would be
expansion: NUDT or Tsinghua or ESA
optimization: CSU or NUDT or Tsinghua
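The candidate lists follow directly from the Exp and Opt columns of the table with the > 95% cut. A small Python sketch of that selection, with the percentages copied from the table and a hypothetical helper name `candidates`:

```python
# Phase percentages copied from the results table (Exp and Opt columns).
exp_pct = {"NUDT": 100, "Tsinghua": 99, "ESA": 97, "Aerospace": 90,
           "HIT_BACC": 80, "CSU": 92, "worhp2orb": 94}
opt_pct = {"NUDT": 98, "Tsinghua": 97, "ESA": 85, "Aerospace": 74,
           "HIT_BACC": 68, "CSU": 100, "worhp2orb": 83}


def candidates(pct, threshold=95):
    """Teams strictly above the threshold, best first."""
    selected = [team for team, value in pct.items() if value > threshold]
    return sorted(selected, key=lambda team: pct[team], reverse=True)


print("expansion:   ", " or ".join(candidates(exp_pct)))
print("optimization:", " or ".join(candidates(opt_pct)))
```

Changing `threshold` shows how quickly the candidate set grows, e.g. worhp2orb enters the expansion list at a 93% cut.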
Would be interesting if another team could use its own code as alternative reference so we could compare results.
Interesting results! Did you validate your solutions? Details on your reference expansion and optimisation?
I could only validate using my own implementation - an implementation which led to a validated solution. I see the following paths to a better post-competition validation:
a) Anastassios implements a post-competition leaderboard using the existing validator.
b) Anastassios reactivates the upload / verification of new solutions without a leaderboard.
c) Anastassios implements a new web interface to the existing validator so that we can still validate.
d) Some other team implements a web interface to their own validator.
e) Someone creates a Matlab validator, as was done for CTOC10, and shares it with the other teams. The initial CTOC10 validator was buggy, but with the help of other teams such bugs can be fixed.
Would like to hear from other teams what they would prefer.
Will write some more details about an improved version of the reference expansion in the following thread about J=4000.
If there are questions about specific details of the algorithm I will answer them.
In the meantime I improved my own verification method and found small constraint violations. Initially I thought these could be fixed by shifting the timings, but at that level (> 3800) the timings are already very tight. So the results shown above are mainly interesting for comparing the mother ship/fast ship routes of the different teams; verifiable results are probably about 3-4% lower. The best verified solution has J = 3847 using 3677 stars, so the J = 4000 goal is still open. If anyone is interested in verifying the J = 3847 solution, please send me an email.