Discussion


lvxiaoxin

Rules about logs and external data

posted in   AI Singapore Trusted Media Challenge

2021-12-17 08:25

23  comments


aisg_billong

2021-12-22 01:42

Hi, could we trouble you to further clarify your request, and what you mean by “final rules”, please? Thank you.
  • lvxiaoxin reply aisg_billong

    2021-12-22 02:14

    Hi, I copied this from Dr Chan’s request: ‘Thanks for the competition. We hope that the organizers will do a thorough review of the teams’ log activity in order to ensure a fair competition. We hope that teams are not probing, i.e. reusing printed-out prediction scores, features, and outcomes and including them in their subsequent predictions. Their outcome of AUC > 0.98 is respectable. Training on external commercial datasets is also not allowed, which we have not done. Once again, we appreciate the effort in hosting this tournament. Thank you. Dr Chan’
  • lvxiaoxin reply aisg_billong

    2021-12-22 02:25

    Thanks for this competition!! All of us did a great job!
  • lvxiaoxin reply aisg_billong

    2021-12-24 02:24

    Can we know whether you are working on this? Maybe you can just answer this:
    1. Did any of the top teams log information other than debug errors during Phase 1 or Phase 2? If the answer is no, a score of AUC > 0.98 is really respectable. Thanks again.
  • petergro reply lvxiaoxin

    2021-12-24 02:52

    Team IVRL here. If it helps, I can tell you that every final result (used for the ranking) will have to be fully reproducible according to the T&C and will be checked step by step, according to our last communications. Should anything arise, I assume everybody will be notified; however, this might take a while due to the time it takes to replicate everything.
    Furthermore, and this will sound biased coming from a fellow competitor, I can tell you that we do not make use of any logging info for our score, there is no specific behavior coded for any specific file, and we did not use any private/commercial datasets.
    With the AUC scores being so high, I think some of them could also be due to luck, as we have definitely reached a level of saturation.
  • aisg_billong reply pchankh

    2022-01-17 02:20

    Hi, as mentioned in the earlier post, the models of the top 3 teams are going through a reproducibility check and are being reviewed by an expert panel. Any potential cheating behaviour would be flagged and dealt with.
    You may wish to email your request to prizechallenge@aisingapore.org.
    Also, we noted that you have made the same post on another thread. Please refrain from making the same post on multiple threads. Thank you.
  • fizzbuzz reply petergro

    2021-12-24 03:11

    Hi, in the final week of the competition our team found out that we could actually log the outputs, e.g. model probabilities and scores (we tested this locally). With this information in the log, we believe there is a severe leakage of the test data, since it would also allow us to re-calibrate and fine-tune our model to fit the test data distribution (a sketch of why this is a leak follows at the end of this thread of replies). NOTE: Our team was very concerned about this and raised it earlier in a separate discussion at https://trustedmedia.aisingapore.org/forum/view_post_category/120/. With no green light from the organizers, our team decided NOT to log any derived outputs from our model in our submission.
  • petergro reply fizzbuzz

    2021-12-24 03:24

    From my understanding, this case would be caught by the reproducibility check, as the training pipelines will also be checked; it is not only the final models that are checked. At least according to the latest communication.
  • lvxiaoxin reply petergro

    2021-12-24 03:42

    Thanks for your reply.
    Regarding “I can tell you that we do not make use of any logging info for our score and there is no specific behavior coded for any specific file”: I am not good at English; do you mean you did NOT log any prediction info?
    Regarding “this will sound biased coming from a fellow competitor”: sorry for this, but I don’t see why it would sound biased if you did NOT log any prediction info.
  • lvxiaoxin reply fizzbuzz

    2021-12-24 04:28

    I think I just need to know which model I should optimize during Phase 2, or to check the distribution of those four cases on the test set. I also suspect that the ratio of ‘real’ samples on the test set may be smaller than in the training set. I think I could easily get that information by logging some info, without wasting a submission (a sketch of what such logging could reveal follows below).
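
A minimal sketch of the leakage concern raised in the replies above, assuming a hypothetical inference loop and log format; none of the names (`predict_proba`, `submission.log`) come from any team's actual code. It illustrates why per-sample probabilities written to a log that is later returned to the team amount to test-data leakage: offline, the team could estimate the hidden test set's class balance and re-calibrate its decision threshold or prior to match.

```python
# Hypothetical illustration only; no team's submission code is reproduced here.
import json
import numpy as np

def run_inference(model, test_files, log_path="submission.log"):
    """Inference loop that (problematically) writes per-sample outputs to the log."""
    with open(log_path, "w") as log:
        for path in test_files:
            prob_fake = model.predict_proba(path)  # assumed model API
            # This line is the leak in question: the log is returned to the team
            # after the run, so derived outputs on the hidden test set escape.
            log.write(json.dumps({"file": path, "prob_fake": float(prob_fake)}) + "\n")

def estimate_real_ratio(log_path="submission.log", threshold=0.5):
    """Offline, the returned log reveals the hidden test set's class balance,
    which can then be used to re-calibrate a decision threshold or class prior."""
    probs = np.array([json.loads(line)["prob_fake"] for line in open(log_path)])
    return float((probs < threshold).mean())  # rough estimate of the 'real' fraction
```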

pchankh

2021-12-24 10:07

Dear organizers,
If teams are logging the model predictions and syncnet scores to the log, they can potentially do many tasks offline and gain a significant unfair advantage (e.g. testing sampled ensembles offline, building external models from the model scores, probing the leaderboard scores, and many other tasks; a sketch of this kind of offline use appears after the reply below). Any logging of model predictions should be, and must be treated as, a violation of fair competition! We sincerely believe that the organizers will do a good job perusing the logs and the source code of the participants. Once again, thank you for the competition! Merry Christmas. Dr Chan
  • aisg_billong reply pchankh

    2022-01-10 00:05

    Hi, the models of the top 3 teams are going through a reproducibility check and are being reviewed by an expert panel. Any potential cheating behaviour would be flagged and dealt with. Thank you.
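
To make the offline-ensembling concern above concrete, here is a purely hypothetical sketch; the file names are invented and no team's code or data is reproduced. If per-sample scores from two earlier submissions were recovered from returned logs, many candidate blends could be formed offline at zero submission cost, and only a few of them submitted to probe the leaderboard AUC.

```python
# Hypothetical illustration of offline ensembling from logged per-sample scores.
import numpy as np

# Per-sample probabilities recovered from the returned logs of two earlier
# submissions, in the same hidden-test-file order (file names are invented).
scores_a = np.load("logged_scores_a.npy")
scores_b = np.load("logged_scores_b.npy")

# Many candidate blends can be formed offline at zero submission cost ...
weights = np.linspace(0.0, 1.0, 11)
blends = [w * scores_a + (1.0 - w) * scores_b for w in weights]

# ... and only a few representative ones need real submissions to probe the
# leaderboard AUC; the team keeps whichever probe scores best.
for idx in (2, 5, 8):
    np.savetxt(f"probe_w{weights[idx]:.1f}.csv", blends[idx], delimiter=",")
```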

pchankh

2022-01-15 04:02

Dear organizers,
We understand that there have been unfair practices of printing out prediction-model and syncnet scores for each test sample during Phase 1 and Phase 2. We seek your fair assessment of such acts of probing, as they can give an unfair advantage by allowing teams to tune their models offline on this information. Our team highlighted this to the organizers early in the competition. It is important to review these unfair means properly, as it will otherwise erode confidence in AI competitions in Singapore going forward. We would like to request a conversation with the organizers on the above matter. Best regards, Dr Chan