Ability of AI models to execute highly complex tasks has grown rapidly

Maximum complexity of a task that an AI model can complete successfully*. Complexity is measured by the number of hours a human would take to complete the same task.

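[Figure: chart of task complexity (y-axis, human hours, ticks 0, 4, 8, 12, 16) against AI model launch date (x-axis, 2020 to 2026), with a confidence interval. Labeled models: GPT-2 (launched in 2019), GPT-3.5, GPT-4, o3, Claude Opus 4.6, Claude Mythos preview. One further label reads "GPT-3.5" followed by "High".]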
