From f8e91c29d68b22015cec2d6ede6881846184e72e Mon Sep 17 00:00:00 2001 From: JJJHolscher Date: Thu, 2 Nov 2023 12:06:29 +0100 Subject: [PATCH] automated deployment --- AIS brief voor politieke partijen.html | 391 +++++++++ aisfcw7.html | 412 +++++++++ goal misgeneralization.html | 357 ++++++++ hello.html | 8 +- index.html | 42 +- index.xml | 31 +- inner misalignment.html | 348 ++++++++ j/2023-11-02.html | 818 ++++++++++++++++++ j/index.html | 818 ++++++++++++++++++ j/index.xml | 33 + listings.json | 15 +- longtermism.html | 357 ++++++++ open confusion.html | 46 +- outer misalignment.html | 348 ++++++++ polarization.html | 351 ++++++++ qry/hello.html | 818 ------------------ reward hacking.html | 350 ++++++++ sae.html | 362 ++++++++ search.json | 167 +++- sitemap.xml | 70 +- x/hello.html | 818 ++++++++++++++++++ .../figure-html/fig-polar-output-1.png | Bin xai.html | 357 ++++++++ 23 files changed, 6424 insertions(+), 893 deletions(-) create mode 100644 AIS brief voor politieke partijen.html create mode 100644 aisfcw7.html create mode 100644 goal misgeneralization.html create mode 100644 inner misalignment.html create mode 100644 j/2023-11-02.html create mode 100644 j/index.html create mode 100644 j/index.xml create mode 100644 longtermism.html create mode 100644 outer misalignment.html create mode 100644 polarization.html delete mode 100644 qry/hello.html create mode 100644 reward hacking.html create mode 100644 sae.html create mode 100644 x/hello.html rename {qry => x}/hello_files/figure-html/fig-polar-output-1.png (100%) create mode 100644 xai.html diff --git a/AIS brief voor politieke partijen.html b/AIS brief voor politieke partijen.html new file mode 100644 index 0000000..1e619b7 --- /dev/null +++ b/AIS brief voor politieke partijen.html @@ -0,0 +1,391 @@ + + + + + + + + + + +ais-brief-voor-politieke-partijen + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ +
+ + + + +
+ + + +

Dear representative of [political party],

+

In this election I am voting for safe artificial intelligence (AI Safety, or AIS).
+I am writing to express my concerns and to ask what role AIS plays in the plans of [political party].

+

My concern is that humanity will lose control over future systems and that we have little time to find a solution.

+
+

The time until artificial intelligence matches humans

+

Before GPT-4 was released, 300 experts were interviewed about their expectations for future artificial intelligence.
+The majority expect that future artificial intelligence will match humans in most areas.
+Aggregated, their forecasts put this around 2060. For many of them, however, GPT-4 turned out to be considerably more capable than they had expected.

+

Other authorities expect AI to surpass humans considerably sooner.
+Sam Altman, the CEO of OpenAI, thinks this could happen within 10 years.
+Dario Amodei, the CEO of Anthropic, thinks it could happen within 2 years.
+Shane Legg, co-founder of DeepMind, thinks there is a 50% chance that AI will match humans during 2028.

+

Prediction markets, where people bet on future events, also make forecasts about how quickly artificial intelligence is improving.
+Forecasters on Metaculus expect that, around 2032, AI will do many human tasks better than humans. The history of that market shows that the introduction of GPT-4 took 8 years off that forecast.

+
+
+

Intelligence Explosion

+

At some point, AI will become powerful enough to contribute to improvements in AI itself.
+This will likely happen around the point where AI can perform most human tasks as well as humans can.
+Once AI improves itself automatically, years of human research could be done by a computer within months or days, without human oversight.
+According to the 300 experts, there is a 50% chance of such an intelligence explosion.

+

This means that artificial superintelligence could arise shortly after human-level AI. How powerful it will be cannot be predicted.
+Manifold, another prediction market, expects a 40% chance that superintelligence exists in 2030.

+
+
+

Open problems in controlling future AI

+

At the moment, society is already struggling to keep up with artificial systems.

+
    +
  • Visual material is becoming ever easier to falsify.
  • +
  • Chatbots can spread fake news on a massive scale.
  • +
  • Autonomous weapons are appearing.
  • +
  • The work of artists can easily be imitated.
  • +
+

But these problems still have a human at their origin.
+As AI becomes more capable, risks arise that even its creator does not intend.
+That is why leading AI experts are sounding the alarm now.

+ + + +
+ +
+ +
+ + + + \ No newline at end of file diff --git a/aisfcw7.html b/aisfcw7.html new file mode 100644 index 0000000..2d6f948 --- /dev/null +++ b/aisfcw7.html @@ -0,0 +1,412 @@ + + + + + + + + + + +aisfcw7 + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ +
+ + + + +
+ + + +
+

AI Governance

+

Before I summarize these readings, I’ll write down my views so I might reflect on how these readings change them.

+

In the limit of time, AI governance is harder than alignment.
+Research will proceed, even under the strictest of moratoria.
+There is a decent chance that superhuman AI could run on current hardware[?], so once that software is found, AGI can only be prevented if all compute is monitored.
+If that software doesn't get found, then you can still survive with coarse-grained monitoring.

+

This monitoring will be imperfect. When devices have internet access, it is hard to know which fragments of compute belong to the same “chunk”. Even without internet access, programs can communicate with each other in unorthodox ways. However, it is currently very hard to train a model in a distributed fashion.

+

As time passes, AGI will become increasingly hard to prevent.
+Until that time, governance can stall development and build up a safety-minded culture.

+
+

AI Governance: Opportunity and Theory of Impact

+

Summary of this.

+
+

AI governance is a new field and is relatively neglected.

+
+

This paper was written in 2020. “Neglected” is an overstatement by now, but none of the political parties in my country have a stance on X-risk yet.

+
+

this piece is primarily aimed at a longtermist perspective

+
+

I’ve heard this term come up less and less. Most longtermist cause areas actually have an expected impact within our lifetimes. AIS is no exception.

+
+

We see this scramble in contemporary international tax law, competition/antitrust policy, innovation policy, and national security motivated controls on trade and investment.

+
+
+

2 problems

+

The problem of managing AI competition:
+> Problems of building safe superintelligence are made all the more difficult if the researchers, labs, companies, and countries developing advanced AI perceive themselves to be in an intense winner-take-all race with each other, since then each developer will face a strong incentive to “cut corners”

+

The problem of constitution design:
+> A subsequent governance problem concerns how the developer should institutionalize control over and share the bounty from its superintelligence;

+
+
+

3 perspectives

+

Superintelligence

+

Ecology
+> a diverse, global, ecology of AI systems. Some may be like agents, but others may be more like complex services, systems, or corporations. These systems, individually or in collaboration with humans, could give rise to cognitive capabilities in strategically important tasks that exceed what humans are otherwise capable of

+

General Purpose Technology, tool AI

+
+
+

risks

+

Misuse and accident risks are associated with ASI.
+> These lenses typically identify the opportunity for safety interventions to be causally proximate to the harm: right before the system is deployed or used there was an opportunity for someone to avert the disaster through better motivation or insight.

+

Structural risks can be associated with the ecology and GPT perspectives.
+> we see that technology can produce social harms, or fail to have its benefits realized, because of a host of structural dynamics

+

These structural risks might not be existential threats on their own. But they can be “existential risk factors”. They indirectly affect X-risk.

+
+
+

pathways to x-risk

+
+

Relatively mundane changes in sensor technology, cyberweapons, and autonomous weapons could increase the risk of nuclear war

+
+
+

Technology can lead to general turbulence.

+
+
+

The world could become much more unequal, undemocratic, and inhospitable to human labor

+
+
+

the spectre of mass manipulation through psychological profiling as advertised by Cambridge Analytica hovers on the horizon. A decline in the ability of the world’s advanced democracies to deliberate competently would lower the chances that these countries could competently shape the development of advanced AI.

+
+

And finally, if there is sufficiently intense competition:
+> a tradeoff between any human value and competitive performance incentivize decision makers to sacrifice that value.

+
+
+

theory of impact

+ + +
+
+
+ +
+ +
+ + + + \ No newline at end of file diff --git a/goal misgeneralization.html b/goal misgeneralization.html new file mode 100644 index 0000000..af91c9d --- /dev/null +++ b/goal misgeneralization.html @@ -0,0 +1,357 @@ + + + + + + + + + + +goal-misgeneralization + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ +
+ + + + +
+ + + +
+

Goal Misgeneralization

+

The range of environments in which an AI’s behavior differs from its behavior in the training environment.

+
+

But this also includes environments in which the AI simply acts incompetently.

+
+
+

Goal-directedness is an underdefined concept

+

Rohin Shah et al. use “how easily a model can be fine-tuned to some task” as a measure of that model’s capability for that task.
+I don’t like this tuneability measure.

+
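As a rough sketch of what such a tuneability measure could look like (my own toy illustration, not the actual procedure from Shah et al.; the function name, threshold, and toy dataset are made up for this example): fine-tune the model on the target task in small batches and count how many updates it takes to reach a performance threshold; the fewer updates needed, the more latent capability the model is assumed to already have for that task.

```python
# Toy sketch of "fine-tuneability as capability": count how many small
# fine-tuning updates a model needs before it clears a performance bar
# on the target task. (Illustrative only; not the setup from Shah et al.)
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier

def steps_to_threshold(model, X, y, threshold=0.9, max_steps=200, batch=32):
    """Fine-tune `model` on half the data in mini-batches and return the
    number of updates needed to reach `threshold` accuracy on the rest."""
    rng = np.random.default_rng(0)
    split = len(X) // 2
    X_tune, y_tune, X_val, y_val = X[:split], y[:split], X[split:], y[split:]
    classes = np.unique(y)
    for step in range(1, max_steps + 1):
        idx = rng.integers(0, len(X_tune), size=batch)
        model.partial_fit(X_tune[idx], y_tune[idx], classes=classes)
        if model.score(X_val, y_val) >= threshold:
            return step        # few updates ~ high latent capability
    return max_steps           # threshold never reached ~ low capability

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
print("updates needed:", steps_to_threshold(SGDClassifier(random_state=0), X, y))
```

Comparing two models under this kind of measure comes down to comparing their step counts, and that dependence on the fine-tuning setup is exactly the part I find unsatisfying.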

Langosco et al. might have a better definition, but I still have to check that out.

+ + +
+
+ +
+ +
+ + + + \ No newline at end of file diff --git a/hello.html b/hello.html index dc87de4..9105ffe 100644 --- a/hello.html +++ b/hello.html @@ -7,7 +7,7 @@ -MouseTrap - Quarto Basics +mousetrap - Quarto Basics + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ +
+ + + + +
+ + + +
+

Inner Misalignment

+

I use this term interchangeably with goal misgeneralization, but this usage is not universal.

+ + +
+ +
+ +
+ + + + \ No newline at end of file diff --git a/j/2023-11-02.html b/j/2023-11-02.html new file mode 100644 index 0000000..a014919 --- /dev/null +++ b/j/2023-11-02.html @@ -0,0 +1,818 @@ + + + + + Protected Page + + + + + + + + + + + + + +
+
+
+ + + + + + diff --git a/j/index.html b/j/index.html new file mode 100644 index 0000000..ef0e439 --- /dev/null +++ b/j/index.html @@ -0,0 +1,818 @@ + + + + + Protected Page + + + + + + + + + + + + + +
+
+
+ + + + + + diff --git a/j/index.xml b/j/index.xml new file mode 100644 index 0000000..f968b43 --- /dev/null +++ b/j/index.xml @@ -0,0 +1,33 @@ + + + +mousetrap +https://www.mousetrap.blog/j/index.html + + +quarto-1.3.450 +Thu, 02 Nov 2023 11:06:22 GMT + + + https://www.mousetrap.blog/j/2023-11-02.html + +

2023-11-03

+

Today I set up a Beeminder pledge to occasionally post about my progress here.

+ + + + + ]]>
+ https://www.mousetrap.blog/j/2023-11-02.html + Thu, 02 Nov 2023 11:06:22 GMT +
+
+
diff --git a/listings.json b/listings.json index 2240c46..a59bba8 100644 --- a/listings.json +++ b/listings.json @@ -1,13 +1,20 @@ [ + { + "listing": "/open confusion.html", + "items": [ + "/xai.html" + ] + }, { "listing": "/index.html", - "items": [] + "items": [ + "/longtermism.html" + ] }, { - "listing": "/open confusion.html", + "listing": "/j/index.html", "items": [ - "/pascals wager.html", - "/threat.html" + "/j/2023-11-02.html" ] } ] \ No newline at end of file diff --git a/longtermism.html b/longtermism.html new file mode 100644 index 0000000..c2c1e61 --- /dev/null +++ b/longtermism.html @@ -0,0 +1,357 @@ + + + + + + + + + + +longtermism + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ +
+ + + + +
+ + + +
+

Longtermism

+

Caring about the long-term future.
+A critique of longtermism is that most of its cause areas concern problems that currently living humans will face:

+
    +
  • AI safety
  • +
  • preparing for (artificial) pandemics
  • +
  • resolving the climate crisis
  • +
+

The only cause area I can come up with that is truly long-term is the idea that we should stop mining coal, in case our civilisation ends and the next one needs coal to advance its tech tree.

+

I think AGI is the last problem humanity has to solve. Solving problems that will only arise after AGI arrives is a waste of resources, since AGI will find a better solution using fewer resources.

+

If AGI is not aligned, I expect human extinction. There is no point in leaving presents for future civilisations then.

+ + +
+ +
+ +
+ + + + \ No newline at end of file diff --git a/open confusion.html b/open confusion.html index 8daa2fd..3e62385 100644 --- a/open confusion.html +++ b/open confusion.html @@ -128,63 +128,31 @@

Open Confusions

-
+
-
-
-
-


-


-

-
- -