How much can we learn about AI with interpretability tools?

outline of answer:

  1. How much can we learn with current interpretability tools?

    1. interpretability research is still very new, so there is a lot we don’t yet know

    2. some examples of successful interpretability

    3. is interpretability only post facto justification?

  2. How much can we learn in principle?



AISafety.info

AISafety.info is a project founded by Rob Miles. The website is maintained by a global team of specialists and volunteers from various backgrounds who want to ensure that the effects of future AI are beneficial rather than catastrophic.

© AISafety.info, 2022

AISafety.info is an Ashgro Inc project. Ashgro Inc (EIN: 88-4232889) is a 501(c)(3) public charity incorporated in Delaware.