EA - Scoring forecasts from the 2016 “Expert Survey on Progress in AI” by PatrickL

The Nonlinear Library: EA Forum - A podcast by The Nonlinear Fund


Link to original article

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Scoring forecasts from the 2016 “Expert Survey on Progress in AI”, published by PatrickL on March 1, 2023 on The Effective Altruism Forum.

Summary

This document looks at the predictions made by AI experts in The 2016 Expert Survey on Progress in AI, analyses the predictions on ‘Narrow tasks’, and gives a Brier score to the median of the experts’ predictions.

My analysis suggests that the experts did a fairly good job of forecasting (Brier score = 0.19), and would have been less accurate if they had predicted each development in AI to generally come, by a factor of 1.5, later (Brier score = 0.26) or sooner (Brier score = 0.27) than they actually predicted.

I judge that the experts expected 9 milestones to have happened by now - and that 10 milestones have now happened.

But there are important caveats to this, such as:

I have only analysed whether milestones have been publicly met. AI labs may have achieved more milestones in private this year without disclosing them.
This means my analysis of how many milestones have been met is probably conservative.

I have taken the point probabilities given, rather than estimating probability distributions for each milestone, meaning I often round down, which skews the expert forecasts towards being more conservative and unfairly penalises their forecasts for low precision.

It’s not apparent that forecasting accuracy on these nearer-term questions is very predictive of forecasting accuracy on the longer-term questions.

My judgements regarding which forecasting questions have resolved positively vs negatively were somewhat subjective (justifications for each question in the separate appendix).

Introduction

In 2016, AI Impacts published The Expert Survey on Progress in AI: a survey of machine learning researchers, asking for their predictions about when various AI developments will occur. The results have been used to inform general and expert opinions on AI timelines.

The survey largely focused on timelines for general/human-level artificial intelligence (median forecast of 2056). However, included in this survey was a collection of questions about shorter-term milestones in AI. Some of these forecasts are now resolvable. Measuring how accurate these shorter-term forecasts have been is probably somewhat informative of how accurate the longer-term forecasts are. More broadly, the accuracy of these shorter-term forecasts seems somewhat informative of how accurate ML researchers' views are in general. So, how have the experts done so far?

Findings

I analysed the 32 ‘Narrow tasks’ for which the following question was asked:

How many years until you think the following AI tasks will be feasible with:
a small chance (10%)?
an even chance (50%)?
a high chance (90%)?

Let a task be ‘feasible’ if one of the best resourced labs could implement it in less than a year if they chose to.
Ignore the question of whether they would choose to.

I interpret ‘feasible’ as whether, in ‘less than a year’ before now, any AI models had passed these milestones, and this was disclosed publicly. Since it is now (February 2023) 6.5 years since this survey, I am therefore looking at any forecasts for events happening within 5.5 years of the survey.

Across these milestones, I judge that 10 have now happened and 22 have not happened. My 90% confidence interval is that 7-15 of them have now happened. A full description of milestones, and justification of my judgments, are in the appendix (separate doc).

The experts forecast that:

4 milestones had a 90% chance.

So they expected 6-17 of these milestones to have happened by now. By eyeballing the forecasts for each milestone, my estimate is that they expected ~9 to have happened. I did not estimate the implied probability distribut...
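
For readers unfamiliar with the metric, the Brier scores quoted in the Summary can be illustrated with a minimal sketch. It assumes the standard binary form of the score (the mean squared difference between each forecast probability and the 0/1 outcome); this excerpt does not spell out the author's exact calculation, so it may differ in detail. The function name and the example numbers are hypothetical and are not taken from the survey.

```python
def brier_score(forecasts, outcomes):
    """Mean squared difference between forecast probabilities and 0/1 outcomes.

    Lower is better: 0.0 is a perfect score, 1.0 is the worst possible.
    """
    if len(forecasts) != len(outcomes):
        raise ValueError("forecasts and outcomes must have the same length")
    if not forecasts:
        raise ValueError("need at least one forecast")
    squared_errors = [(p - o) ** 2 for p, o in zip(forecasts, outcomes)]
    return sum(squared_errors) / len(squared_errors)


# Hypothetical illustration only; these are NOT the survey's forecasts or results.
example_forecasts = [0.9, 0.5, 0.1, 0.5]  # probability assigned to each milestone
example_outcomes = [1, 0, 0, 1]           # 1 = milestone met, 0 = not met
example_score = brier_score(example_forecasts, example_outcomes)

# A forecaster who always answers 50% scores 0.25 regardless of the outcomes.
baseline = brier_score([0.5] * len(example_outcomes), example_outcomes)
```

The 0.25 baseline is a useful reference point for the figures above: under this form of the score, 0.19 is somewhat better than always answering 50%, while 0.26 and 0.27 are slightly worse.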
