We call ourselves {\it Homo sapiens}\itindex{Homo sapiens}---man the
wise---because our \newtermi{intelligence} is so important to us.
The field of
\newterm{artificial intelligence}, or AI, attempts not just to understand but
also to {\em build} intelligent entities.
Some have defined intelligence and measured success in terms of fidelity
to {\em human} performance, while others prefer an abstract,
formal definition of intelligence, called
\newterm{rationality}---loosely speaking, doing the
``right thing.''
\subsection{Acting humanly: The Turing Test approach}
The \newterm{Turing Test}, proposed by Alan Turing in 1950, requires a
computer to have the following capabilities:
\plainnewtermitem{natural language processing} to enable it to
communicate successfully in English;
\plainnewtermitem{knowledge representation}
to store what it knows or hears;
\plainnewtermitem{automated reasoning} to answer
questions and to draw new conclusions;
\plainnewtermitem{machine learning} to adapt to new
circumstances and to
detect and extrapolate patterns.
However, the so-called
\newterm{total Turing Test} includes
a video signal so that the interrogator can test the subject's
perceptual abilities, as well as the opportunity for the interrogator to
pass physical objects ``through the hatch.''
To pass the total Turing Test,
the computer will need
\plainnewtermitem{computer vision}\index{computer
vision} to perceive objects, and
\plainnewtermitem{robotics} to manipulate objects and move about.
The interdisciplinary field of \newterm{cognitive
science} brings together computer models from AI and
experimental techniques from psychology to construct precise and testable
theories of the human mind.
Aristotle's\nindex{Aristotle} \newterm[syllogism]{syllogisms} provided patterns for argument
structures that always yielded correct conclusions when given correct
premises---for example, ``Socrates is a man; all men are
mortal; therefore, Socrates is mortal.''
These laws of thought were supposed to govern the
operation of the mind; their study initiated the field called
\newterm{logic}.
The so-called
\newterm{logicist} tradition within artificial intelligence
hopes to build on such programs to create intelligent systems.
The theory of
\newterm{probability} fills this gap, allowing rigorous
reasoning with uncertain information.
\subsection{Acting rationally: The rational agent approach}
An \newterm{agent} is just something that acts ({\it
agent} comes from the Latin {\it agere}, to do).
A \newterm{rational agent} is one that acts so as
to achieve the best outcome or, when there is uncertainty, the best
expected outcome.
In other words, we define artificial intelligence as the study and
construction of agents that \newtermi{do the right thing}.
Chapters \ref{game-playing-chapter}
and~\ref{complex-decisions-chapter} deal with the issue of \newtermi{limited
rationality}---acting appropriately when there is
not enough time to do all the computations one might like.
Ultimately, we want
agents that are \newterm{provably beneficial}\ntindex{artificial
intelligence!provably beneficial} to humans.
Like Aristotle\nindex{Aristotle} and Leibniz, Descartes was a strong advocate of
the power of reasoning in understanding the world, a philosophy now
called \newtermi{rationalism}.
But Descartes was also a proponent of
\newterm{dualism}.
An alternative to
dualism is \termi{materialism} or \newtermi{naturalism}, which holds that
the brain's operation according to the laws of physics {\em
constitutes} the mind.
The
\newterm{empiricism} movement, starting with Francis
Bacon's\nindex{Bacon, F.} (1561--1626) {\it Novum Organum},\footnote{The
{\em Novum Organum} is an update of Aristotle's\nindex{Aristotle} {\em
Organon}, or instrument of thought.}
David Hume's
(1711--1776) {\it A Treatise of Human Nature}~\cite{Hume:1739} proposed
what is now known as the principle of
\newterm{induction}: that general rules are acquired
by exposure to repeated associations between their elements.
Influenced by Bertrand Russell\nindex{Russell, B.} (1872--1970), the famous Vienna
Circle \cite{Sigmund:2017}, a group of philosophers and mathematicians
meeting in Vienna in the 1920s and 1930s, developed the doctrine of
\newterm{logical positivism}\ntindex{logical
positivism}.
This doctrine holds that all
knowledge can be characterized by logical theories connected, ultimately,
to \newterm{observation sentences}\tindex{observation sentences} that
correspond to sensory inputs; thus logical positivism combines
rationalism and empiricism.
The \newterm{confirmation
theory} of Rudolf Carnap\nindex{Carnap, R.}
(1891--1970) and Carl Hempel\nindex{Hempel, C.} (1905--1997) attempted to
analyze the acquisition of knowledge from experience by quantifying the
degree of belief that should be assigned to logical sentences based on
their connection to observations that confirm or disconfirm them.
Jeremy Bentham~\citeyear{Bentham:1823} and
John Stuart Mill~\citeyear{Mill:1863} promoted the idea of
\newtermi{utilitarianism}: that rational decision making should apply to
all spheres of human activity.
The idea of \newtermi{formal logic} can be traced back to the
philosophers of ancient Greece, but its mathematical development
really began with the work of George Boole\nindex{Boole, G.}
(1815--1864), who worked out the details of propositional, or Boolean,
logic~\cite{Boole:1847}.
The theory of \newtermi{probability} can be seen as generalizing logic to
situations with uncertain information---a consideration of critical importance
for AI.
The formalization of probability, combined with the availability of data, led
to the emergence of \newtermi{statistics} as a field.
The history of computation is as old as the history of numbers, but the first
nontrivial \newterm{algorithm} is thought to be Euclid's
algorithm for computing greatest common divisors.
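For illustration only (a standard modern rendering in Python, not the book's presentation), Euclid's algorithm fits in a few lines:
\begin{verbatim}
# Euclid's algorithm: repeatedly replace (a, b) by (b, a mod b)
# until the remainder is zero; the survivor is the GCD.
def gcd(a, b):
    while b != 0:
        a, b = b, a % b
    return a

print(gcd(48, 36))   # -> 12
\end{verbatim}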
Kurt G\"{o}del's \newterm{incompleteness
theorem}\tindex{incompleteness!theorem}\tindex{theorem!incompleteness} showed
that in any formal theory as strong as Peano arithmetic (the elementary theory
of natural numbers), there are true statements that are
undecidable in the sense that they have no proof within
the theory.
This motivated Alan Turing\nindex{Turing, A.} (1912--1954) to try to
characterize exactly which functions {\em are} \newterm{computable}---capable
of being computed by an effective procedure.
Although decidability and computability are important to an understanding of
computation, the notion of \newterm{tractability}\ntindex{tractability of
inference} has had an even greater impact.
The theory of \newterm{NP-completeness}\tindex{NP-complete},
pioneered by Stephen Cook~\citeyear{Cook:1971} and Richard
Karp~\citeyear{Karp:1972}, provides a basis for analyzing the tractability of
problems: any problem class to which the class of NP-complete problems can be
reduced is likely to be intractable.
Daniel Bernoulli proposed instead a principle based on maximization of
expected \newtermi{utility}, an internal, subjective quantity, and explained
human investment choices by proposing that the marginal utility of an
additional quantity of money diminished as one acquired more money.
\newterm{Decision theory}, which combines probability
theory with utility theory, provides a formal and complete framework for
decisions (economic or otherwise) made under uncertainty---that is, in cases
where probabilistic descriptions appropriately capture the decision maker's
environment.
Von Neumann and Morgenstern's development of
\newterm{game theory}\tindex{game theory}~\cite<see also>{Luce+Raiffa:1957}
included the surprising result that, for some games, a rational agent should
adopt policies that are (or at least appear to be) randomized.
This topic was pursued in the field of
\newterm{operations research}, which emerged in World War
II from efforts in Britain to optimize radar
installations, and later found civilian applications in complex
management decisions.
Herbert Simon\nindex{Simon, H.} (1916--2001) won the Nobel Prize in economics in
1978 for his early work showing that models based on
\newtermi{satisficing}---making decisions that are ``good enough,''
rather than laboriously calculating an optimal decision---gave a better
description of actual human behavior \cite{Simon:1947}.
\newterm{Neuroscience} is the study of the
nervous system, particularly the brain.
By that time, it was known that the brain consisted largely of nerve cells,
or \newterm[neuron]{neurons}, but it was not until 1873 that
Camillo Golgi (1843--1926)\nindex{Golgi, C.} developed a staining
technique allowing the observation of individual neurons in the brain
(see \figref{neuron-figure}).
These are augmented by advances in
single-cell electrical recording of neuron activity and by the methods
of \newtermi{optogenetics}~\cite{Crick:1999,Zemelman+al:2002,Han+Boyden:2007}, which allow both measurement and control of individual neurons
modified to be light-sensitive.
The development of
\newterm[brain--machine interface]{brain--machine
interfaces}\ntindex{brain--machine
interface}~\cite{Lebedev+Nicolelis:2006} for both sensing and motor
control not only promises to restore function to disabled individuals
but also sheds light on many aspects of neural systems.
Futurists make much of these numbers, pointing to an
approaching \newtermi{singularity} at which computers reach a superhuman
level of
performance~\cite{Vinge:1993,Kurzweil:2005,Doctorow+Stross:2012}, and
then rapidly improve themselves even further.
Applying this viewpoint to humans,
the \newterm{behaviorism} movement, led by John
Watson\nindex{Watson, J.}
\newterm{Cognitive psychology}, which
views the brain as an information-processing device, can be traced
back at least to the works of William James\nindex{James,
W.} (1842--1910).
The central figure in the creation of what is now called
\newterm{control theory} was
Norbert Wiener\nindex{Wiener, N.} (1894--1964).
Wiener's book {\it
Cybernetics}~\citeyear{Wiener:1948}\newterm[cybernetics]{}
became a bestseller and awoke the public to the possibility of
artificially intelligent machines.
Ashby's {\em Design for a
Brain\/} \citeyear{Ashby:1948,Ashby:1952} elaborated on his idea that
intelligence could be created by the use of
\newtermi{homeostatic} devices containing
appropriate feedback loops to achieve stable adaptive behavior.
Modern control theory, especially the branch known as stochastic
optimal control, has as its goal the design of systems that maximize
an \newtermi{objective function} over time.
Modern linguistics and AI, then, were ``born'' at about the same time, and grew
up together, intersecting in a hybrid field called \newtermi{computational
linguistics} or \term{natural language processing}\tindex{language!processing}.
His rule, now called \newtermi{Hebbian
learning}, remains an influential model to this day.
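As a hedged illustration (the symbols \(\Delta w_{ij}\), \(a_i\), \(a_j\), and \(\eta\) are notation assumed here, and exact formulations vary), the basic Hebbian update strengthens a connection in proportion to the product of the two units' activities:
\[ \Delta w_{ij} = \eta \, a_i \, a_j , \]
where \(w_{ij}\) is the weight of the connection between units \(i\) and \(j\), \(a_i\) and \(a_j\) are their activation levels, and \(\eta\) is a small learning-rate constant.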
The success of \nosystem{GPS}
and subsequent programs as models of cognition led
\citeA{Newell+Simon:1976} to formulate the famous
\newterm{physical symbol system} hypothesis, which states that ``a
physical symbol system has the necessary and sufficient means for
general intelligent action.''
In 1958, in MIT AI Lab Memo No.~1, John McCarthy defined the high-level
language \newterm{Lisp}, which was to become the dominant AI
programming language for
the next 30 years.
These limited domains
became known as \newterm[microworld]{microworlds}.
Early experiments in \newterm{machine
evolution} (now called
\newtermi{genetic programming}) \cite{Friedberg:1958,Friedberg+al:1959} were
based on the undoubtedly correct belief that by making an appropriate series of
small mutations to a machine-code program, one can generate a
program with good performance for any particular task.
Such approaches have been called \newterm[weak method]{weak
methods} because, although general, they do not scale up
to large or difficult problem instances.
In 1971, Feigenbaum and others at
Stanford began the Heuristic Programming
Project (HPP) to investigate the
extent to which the new methodology of \newterm{expert
systems} could be applied to other
areas of human expertise.
\system{Mycin} incorporated a calculus of uncertainty called
\newterm[certainty factor]{certainty factors} (see
\chapref{bayes-nets-chapter}), which seemed (at the time) to fit well
with how doctors assessed the impact of evidence on the diagnosis.
Others, following Minsky's idea
of \newterm{frames}~\citeyear{Minsky:1975},
adopted a more structured approach, assembling facts about particular object
and event types and arranging the types into a large taxonomic
hierarchy analogous to a biological taxonomy.
\subsection{The return of neural networks (1986--present)}
In the mid-1980s at least four different groups reinvented the
\newtermi{back-propagation} learning algorithm first found in 1969 by Bryson
and Ho.
These so-called \newterm{connectionist} models were seen
by some as direct competitors both to the symbolic models promoted by Newell
and Simon and to the logicist approach of McCarthy and others.
In the 1980s, approaches using
\newterm{hidden Markov models} (HMMs) came to
dominate the area.
Pearl's development of \newterm[Bayesian network]{Bayesian
networks}\tindex{Bayesian network} yielded a rigorous and efficient formalism
for representing uncertain knowledge as well as practical algorithms for
probabilistic reasoning.
The term \newtermi{deep learning} refers to machine learning using networks
with multiple layers.
They also produce an \newtermi{AI Index} at {\tt aiindex.org}
to help track progress.
{\bf Image understanding}: Not content with exceeding human accuracy on
the challenging \system{ImageNet} object recognition task, computer vision
researchers have taken on the more difficult problem of \newtermi{image
captioning}.
At present,
the research community and the major corporations involved in AI research have
developed voluntary self-governance principles for AI-related activities,
including the \newterm{Asilomar AI Principles} by the Future of Life
Institute~\citeyear{FLI:2017} and the tenets of the Partnership on
AI~\citeyear{PAI:2017}.
They called the
effort \newtermi{human-level AI} or HLAI; their first symposium was in 2004
\cite{Minsky+al:2004}.
Another effort with similar goals, the so-called AGI or
\newtermi{Artificial General Intelligence}
movement~\cite{Goertzel+Pennachin:2007}, held its first conference and
organized the {\em Journal of Artificial General Intelligence} in 2008.
We might call this the
\newtermi{gorilla problem}: about seven million years ago, a now-extinct
primate evolved, with one branch leading to gorillas and one to humans.
To pick on one
example, we might call this the \newtermi{King Midas problem}:
Midas, a legendary king in Greek mythology, asked that everything he
touched should turn to gold, but then regretted it when his food and
drink turned to gold and he died of starvation.
In modern terminology,
we call this the problem of \newtermi{value alignment}: the values or
objectives put into the machine must be aligned with those of the
human.
In \chapref{reinforcement-learning-chapter},
we explain the methods of \newtermi{inverse reinforcement learning}
that allow machines to learn more about human preferences from
observations of the choices that humans make.
An \termi{agent} is anything that can be viewed as
perceiving its
\newterm{environment} through
\newterm[sensor]{sensors} and acting upon
that environment through
\newterm[actuator]{actuators}.
We use the term
\newtermi{percept} to refer to the agent's perceptual
inputs at any given instant.
An agent's \newtermi{percept sequence}
is the complete history of everything the agent has ever perceived.
Mathematically speaking, we say that an agent's behavior
is described by the \newterm{agent function} that maps
any given percept sequence to an action.
{\em Internally}, the agent function
for an artificial agent will be implemented by an
\newterm{agent program}.
\section{Good Behavior: The Concept of Rationality}
A \newterm{rational agent} is one that does
the right thing---conceptually speaking, every entry in the table for
the agent function is filled out correctly.
Moral philosophy has developed several
different notions of the ``right thing,'' but AI has generally stuck
to one notion called \newtermi{consequentialism}: we evaluate an
agent's behavior by its consequences.
This notion of desirability is captured by a \newterm{performance
measure} that evaluates any given
sequence of environment states.
This leads to a \newterm{definition of a rational agent}\tindex{agent!rational}:
For each possible percept sequence, a rational agent should select an
action that is expected to maximize its performance measure,
given the evidence provided by the percept sequence and
whatever built-in knowledge the agent has.
Consider the simple vacuum-cleaner agent that cleans a square if it is
dirty and moves to the other square if not; this is the agent function
tabulated in \tabref{vacuum-agent-function-table}.
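As an illustrative sketch in Python (not the book's code; the function name and the two-location world with squares A and B are assumptions), the tabulated agent function reduces to a few condition--action tests:
\begin{verbatim}
# Minimal sketch of the two-square vacuum world's reflex agent.
# A percept is a (location, status) pair, e.g. ('A', 'Dirty').
def reflex_vacuum_agent(percept):
    location, status = percept
    if status == 'Dirty':
        return 'Suck'      # clean the current square
    elif location == 'A':
        return 'Right'     # otherwise move to the other square
    else:
        return 'Left'

print(reflex_vacuum_agent(('A', 'Dirty')))   # -> Suck
print(reflex_vacuum_agent(('B', 'Clean')))   # -> Left
\end{verbatim}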
If the geography of the environment is unknown, the agent will
need to \newterm[exploration]{explore} it.
\subsection{Omniscience, learning, and autonomy}
We need to be careful to distinguish between rationality and
\newtermi{omniscience}.
Doing actions {\em in order to modify future
percepts}---sometimes called
\newterm{information gathering}---is an
important part of rationality and is covered in depth in
\chapref{decision-theory-chapter}.
Our definition requires a rational agent not only to gather
information but also to \newterm[learning]{learn}
as much as possible from what it perceives.
To the extent that an agent relies on the prior knowledge of its
designer rather than on its own percepts, we say that the agent lacks
\newterm{autonomy}.
A rational agent should be autonomous---it should learn what it can to
compensate for partial or incorrect prior knowledge.
First, however, we must think
about \newterm[task environment]{task environments}, which are
essentially the ``problems'' to which rational agents are the
``solutions.''
For the acronymically minded, we call this the
\newterm{PEAS} ({\bf P}erformance, {\bf E}nvironment, {\bf A}ctuators,
{\bf S}ensors) description.
Note that virtual
task environments can be just as complex as the ``real'' world:
for example, a \newterm[software agent]{software agent} (or software
robot or \newterm[softbot]{softbot}) that trades
on auction and reselling websites deals with millions of other users
and billions of objects, many with real images.
\newterm{Fully observable}
vs.~\newterm{partially observable}\ntindex{environment!partially
observable}: If an agent's sensors give it access to the complete
state of the environment at each point in time, then we say that the
task environment is fully observable.
If the agent has no sensors at all then the environment is
\newterm{unobservable}.
\newterm{Single agent} vs.~\newterm{multiagent}:
The distinction between single-agent and multiagent environments may
seem simple enough.
Thus, chess is a
\newterm{competitive}
multiagent environment.
In the taxi-driving environment, on the other
hand, avoiding collisions maximizes the performance measure of all
agents, so it is a partially \newterm{cooperative} multiagent environment.
\newterm{Deterministic}\index{deterministic
environment}
vs.~\newterm{stochastic}.
One final note: our use of the word
``stochastic'' generally implies that uncertainty about outcomes is
quantified in terms of probabilities; a
\newterm{nondeterministic} environment is one in
which actions are characterized by their {\em possible} outcomes, but
no probabilities are attached to them.
\newterm{Episodic}
vs.~\newterm{sequential}: In an episodic task environment, the agent's experience
is divided into atomic episodes.
\newterm{Static} vs.~\newterm{dynamic}:
If the environment can change while an agent is deliberating, then we
say the environment is dynamic for that agent; otherwise, it is
static.
If
the environment itself does not change with the passage of time but
the agent's performance score does, then we say the environment is
\newterm{semidynamic}.
\newterm{Discrete} vs.~\newterm{continuous}:
The discrete/continuous distinction applies to the {\em state}
of the environment, to the way {\em time} is handled, and to the {\em
percepts} and {\em actions} of the agent.
\newterm{Known} vs.~\newterm{unknown}:
Strictly speaking, this distinction refers not to the environment
itself but to the agent's (or designer's) state of knowledge about the
``laws of physics'' of the environment.
Such
experiments are often carried out not for a single environment but
for many environments drawn from an \newterm{environment
class}.
For this reason, the
code repository also includes an \newterm{environment generator} for
each environment class.
The job of AI is to design an \newterm{agent
program} that implements the agent function---the
mapping from percepts to actions.
We assume this program will run on some
sort of computing device with physical sensors and actuators---we call
this the \newterm{architecture}:
\mbox{\em agent} = \mbox{\em architecture} + \mbox{\em program}\ .
\subsection{Simple reflex agents}
The simplest kind of agent is the \newterm{simple reflex
agent}.
We call such a connection a
\newterm{condition--action
rule},\footnote{Also called
\term{situation--action rules}\tindex{rule!situation--action},
\term{productions}\tindex{production}, or \term{if--then rules}\tindex{rule!if--then}.}
Escape from infinite loops is possible if the agent can
\newterm[randomization]{randomize} its actions.
That is, the agent should maintain some sort of
\newterm{internal state} that depends on the
percept history and thereby reflects at least some of the unobserved
aspects of the current state.
An agent that
uses such models is called a \newterm{model-based
agent}.
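A minimal sketch of the idea, under assumptions (the {\tt update\_state} helper and the rule representation are illustrative, not the book's code): the agent keeps an internal state, updates it from the last action, its model of the world, and the latest percept, and only then selects an action.
\begin{verbatim}
# Sketch of a model-based reflex agent: internal state is refreshed from
# the last action and the new percept before a condition--action rule fires.
def model_based_reflex_agent(update_state, rules, initial_belief=None):
    memory = {'belief': initial_belief, 'last_action': None}
    def program(percept):
        memory['belief'] = update_state(memory['belief'],
                                        memory['last_action'], percept)
        for condition, action in rules:
            if condition(memory['belief']):
                memory['last_action'] = action
                return action
        memory['last_action'] = None
        return None
    return program
\end{verbatim}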
In other words, as well
as a current state description, the agent needs some sort of
\newterm{goal} information that describes situations
that are desirable---for example, being at a particular
destination.
Because
``happy'' does not sound very scientific, economists and computer
scientists use the term \newterm{utility}
instead.\footnote{The word ``utility'' here refers to ``the quality of
being useful,'' not to the electric company or waterworks.}
An agent's \newterm{utility
function} is essentially an internalization
of the performance measure.
Technically speaking, a rational utility-based agent chooses the
action that maximizes the \newterm{expected
utility} of the action outcomes---that is, the
utility the agent expects to derive, on average, given the
probabilities and utilities of each outcome.
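As a hedged formalization (the notation here is assumed, not quoted from the text), choosing the action with maximum expected utility in state \(s\) can be written as
\[ a^{*} = \operatorname*{arg\,max}_{a} \sum_{s'} P(s' \mid s, a)\, U(s'), \]
where \(P(s' \mid s, a)\) is the probability that action \(a\) taken in state \(s\) leads to outcome state \(s'\) and \(U\) is the agent's utility function.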
The most important distinction is between the
\newterm{learning element}, which is responsible for making improvements, and
the \newterm{performance element}, which
is responsible for selecting external actions.
The learning element
uses feedback from the \newterm{critic}
on how the agent is doing and determines how the performance element
should be modified to do better in the future.
The last component of the learning agent is the \newterm{problem
generator}.
In an \newterm{atomic representation}\ntindex{atomic
representation} each state of the world is
indivisible---it has no internal structure.
A
\newterm{factored representation}\ntindex{factored
representation} splits up each
state into a fixed set of
\newterm[variable]{variables} or \newterm[attribute]{attributes},
each of which can have a \newtermi{value}.
Instead, we would need a
\newterm{structured representation}\ntindex{structured
representation}, in which objects
such as cows and trucks and their various and varying relationships can be described
explicitly.
As we mentioned earlier, the axis along which atomic, factored, and structured
representations lie is the axis of increasing
\newterm{expressiveness}.
If there is a one-to-one mapping between concepts and memory locations,
we call that a \newtermi{localist representation}.
But if the representation of a concept is spread over many memory locations,
and each memory location is employed as part of the representation of multiple
different concepts, we call that a \newtermi{distributed representation} \cite{Hinton:1986}.
The concept of a
\newterm{controller} in control
theory is identical to that of an agent in AI.
It has also
infiltrated the area of operating systems, where \newtermi{autonomic
computing} refers to computer systems and networks that monitor and
control themselves with a perceive--act loop and machine learning
methods~\cite{Kephart+Chess:2003}.
Noting that a collection of agent programs designed to work well together in a true multiagent environment
necessarily exhibits modularity---the programs share no internal state and communicate with each other only through the environment---it is common within the field of
\newtermi{multiagent systems} to design the agent program of a single agent
as a collection of autonomous sub-agents.
Such an agent is
called a \newterm{problem-solving agent}, and
the process of looking for a path to a goal is called \newterm{search}.
With that information, the agent can follow
this four-phase problem-solving process:
\item \newterm{Goal formulation}: The
agent adopts the \term{goal}\tindex{goal} of reaching
Bucharest.
\item \newterm{Problem formulation}: The
agent devises a \term{model}\tindex{model} of the world that
describes the states and actions necessary to reach the goal.
\item \newterm{Search}: Before taking any action in the real world,
the agent simulates sequences of actions in its model, searching
until it finds a sequence of actions that reaches the goal.
Such a
sequence is called a \newterm{solution}.
\item \newterm{Execution}: The agent can now execute the actions
in the solution, one at a time.
Control theorists call this an
\newterm{open-loop} system: ignoring the percepts breaks the
loop between agent and environment.
If there is a chance that the model is
incorrect, or the environment is stochastic, then the agent would be safer
using a \newterm{closed-loop} approach that monitors the
percepts (see \secref{partially-observable-search-section}).
\subsection{Well-defined search problems and solutions}
A search \newterm{problem} can be defined formally as follows:
\item A set of possible \newterm{states} that the environment can be in.
We call this the \newtermi{state space}.
\item The \newterm{initial state} that the agent starts
in.
\item A set of one or more \newtermi{goal states}.
\item The \newterm{actions} available to the agent.
We say that each of these actions is
\newtermi{applicable} in \(s\).
\item A \newtermi{transition model}, which describes what each action does.
\item An \newtermi{action cost function}, \(c(s,a,s')\)\index{C@$c$ (action
cost)} that denotes the numeric cost of applying action \(a\) in state \(s\) to
reach state \(s'\).
A sequence of actions forms a
\newterm{path}, and a \term{solution}\tindex{solution} is a path
from the initial state to a goal state.
An \newtermi{optimal solution} has the lowest path cost among all
solutions.
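As an illustrative Python sketch (class and method names are assumptions, loosely in the spirit of the book's companion code rather than its pseudocode), the components above map directly onto a small problem class:
\begin{verbatim}
# Sketch of a search problem: initial state, goal states, applicable
# actions, transition model, and an action cost (1 per action by default).
class Problem:
    def __init__(self, initial, goals):
        self.initial = initial
        self.goals = set(goals)

    def actions(self, state):
        """Actions applicable in `state`."""
        raise NotImplementedError

    def result(self, state, action):
        """Transition model: the state reached by doing `action` in `state`."""
        raise NotImplementedError

    def action_cost(self, s, action, s_prime):
        """Numeric cost c(s, a, s') of applying `action` in `s`."""
        return 1

    def is_goal(self, state):
        return state in self.goals
\end{verbatim}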
The state space can be represented as a \newtermi{graph} in which the vertices
are states and the directed edges between them are actions.
The process of removing detail from a
representation is called \newterm{abstraction}.
Can we be more precise about the appropriate \newterm{level of abstraction}?
A \newtermi{standardized
problem}, such as the Rubik's Cube, is intended to
illustrate or exercise various problem-solving methods.
A
\newterm{real-world problem}, such as robot
navigation, is one whose solutions people actually use, and whose formulation
is idiosyncratic, not standardized, because, for example, each robot has
different sensors that produce different data.
\subsection{Standardized problems}
A \newterm{grid world} problem is a two-dimensional rectangular array of square
cells in which agents can move from cell to cell.
Another type of grid world is the \newterm{sokoban} puzzle, in which the
agent's goal is to push a number of boxes, scattered about the grid, to
designated storage locations.
In \newterm{sliding-block puzzles}, a number of
tiles (sometimes called blocks or pieces) are arranged in a grid with a number of blank spaces so
that some of the tiles can slide into the blank space.
One well-known variant
is the \newterm{8-puzzle} (see
\figref{8puzzle-figure}), which consists of a 3\(\times\)3 grid with eight
numbered tiles and one blank space.
\newterm[Touring problem]{Touring problems},
such as ``Visit every city in
\figref{romania-distances-figure}, starting and ending
in Bucharest'' are similar to route-finding problems, but each state must include not just the current location but
also the {\em set of cities the agent has visited}.
The \newterm{traveling salesperson problem}\ntindex{traveling
salesperson problem}
(TSP) is a touring
problem in which each city must be visited exactly once.
A \newterm{VLSI layout} problem
requires positioning millions of components and connections on a chip
to minimize area, minimize circuit delays, minimize stray capacitances,
and maximize manufacturing yield.
\newterm{Robot navigation}\index{problem!robot
navigation} is a generalization of the route-finding problem described earlier.
\newterm{Automatic assembly sequencing} of
complex objects (such as electric motors) by a robot has been standard industry
practice since the 1970s.
One important assembly problem is \newtermi{protein design}, in which
the goal is to find a sequence of amino acids that will fold into a
three-dimensional protein with the right properties to cure some disease.
\section{Search Algorithms}
A \newtermi{search algorithm} takes a search problem as input and returns a
solution, or an indication of failure.
In this chapter we consider algorithms
that superimpose a \newterm{search tree} over the
state-space graph, forming various paths from the initial state, trying to extend a path to reach a goal state.
We can \newterm{expand} the node,
thereby \newterm{generating} a set of three new
\newtermi{successor nodes} in the search tree.
That is, for each applicable
action, we get a resulting state ({\it Sibiu, Timisoara,} and {\it Zerind}),
for which we create a new \newterm{child node} in the
search tree, where each node has {\it Arad} as its \newtermi{parent node}.
We call this the \newtermi{frontier} of the search
tree.
We say that any state that has had a node generated for it has been
\newtermi{reached} (whether or not that node has been expanded).\footnote{Some
authors call the frontier the \termi{open list},
which is both geographically less evocative and computationally less
appropriate, because a queue is more efficient than a list here.}
Note that the frontier \newterm[separator]{separates} two regions of the
state-space graph: an interior region where every state has been expanded, and
an exterior region of states that have not yet been reached.
A very general
approach is called \newtermi{best-first search}, in which we choose the node,
\(n\), with the minimum value of some \newtermi{evaluation function}, \(f(n)\).
A \newtermi{node} in the tree is represented by a data structure with four components:
\item \var{node}.\prog{State}: the state to which the node corresponds;
\item \var{node}.\prog{Parent}: the node in the tree that generated this node;
\item \var{node}.\prog{Action}: the action that was applied to the parent's state to generate this node;
\item \var{node}.\prog{Path-Cost}: the total cost of the path from the initial
state to this node.
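A minimal Python sketch of that node structure (the dataclass form and field names are assumptions, mirroring the four components just listed):
\begin{verbatim}
# Search-tree node with the four components described above.
from dataclasses import dataclass
from typing import Any, Optional

@dataclass
class Node:
    state: Any                       # state this node corresponds to
    parent: Optional['Node'] = None  # node that generated this one
    action: Any = None               # action applied to the parent's state
    path_cost: float = 0.0           # total path cost from the initial state
\end{verbatim}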
The appropriate
choice is a \newtermi{queue} of some kind, because the operations on a frontier
are:
\item \prog{Is-Empty}(\var{frontier}) returns true only if there are no nodes in the frontier.
Three kinds of queues are used in search algorithms:
\item A \newtermi{priority queue} pops first
the node with the minimum cost according to some evaluation function, \(f\).
\item A \newtermi{FIFO queue} or first-in-first-out queue
pops first the node that was added to the queue first; we shall see it is used
in breadth-first search.
\item A \newterm{LIFO queue} or last-in-first-out queue (also known as a \termi{stack}) pops first the most recently added node; we shall see it is
used in depth-first search.
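Putting the pieces together, here is a hedged Python sketch of the best-first search strategy described above, using a priority queue; it builds on the {\tt Problem} and {\tt Node} sketches and keeps a {\tt reached} table to prune redundant paths. The helper {\tt expand} is an assumption introduced here.
\begin{verbatim}
# Sketch of best-first search: always pop the frontier node with the
# minimum f(n).  heapq serves as the priority queue; a counter breaks ties.
import heapq
import itertools

def expand(problem, node):
    """Generate the child nodes of `node`."""
    for action in problem.actions(node.state):
        s2 = problem.result(node.state, action)
        cost = node.path_cost + problem.action_cost(node.state, action, s2)
        yield Node(s2, node, action, cost)

def best_first_search(problem, f):
    node = Node(problem.initial)
    tie = itertools.count()
    frontier = [(f(node), next(tie), node)]
    reached = {problem.initial: node}
    while frontier:
        _, _, node = heapq.heappop(frontier)
        if problem.is_goal(node.state):          # late goal test
            return node
        for child in expand(problem, node):
            s = child.state
            if s not in reached or child.path_cost < reached[s].path_cost:
                reached[s] = child
                heapq.heappush(frontier, (f(child), next(tie), child))
    return None                                  # failure
\end{verbatim}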
We say that {\it Arad} is a \newterm{repeated
state} in the search tree, generated in this case by a
\newtermi{cycle} (also known as a \newterm{loopy path}).
A cycle is a special case of a \newterm{redundant path}.
We call a search
algorithm a \newtermi{graph search} if it checks
for redundant paths and a \newtermi{tree-like search}\footnote{We say
``tree-like search'' rather than ``tree search'' because the state space is still
the same graph no matter how we search it; we are just choosing to treat it as
if it were a tree.} if it does not check.
We can evaluate an algorithm's
performance in four ways:
\newtermitem{Completeness}\ntindex{completeness!search algorithm@of a search
algorithm} Is the algorithm guaranteed to find a solution when there is one,
and to correctly report failure when there is not?
\newtermitem{Cost optimality} Does
it find a solution with the lowest path cost of all solutions?\footnote{Some
authors use the term ``admissibility'' for the property of finding the
lowest-cost solution, and some use just ``optimality,'' but that can be
confused with optimal efficiency.}
\newtermitem{Time complexity} How long does it take to
find a solution?
\newtermitem{Space complexity} How much memory is
needed to perform the search?
To be complete, a search algorithm must be
\newtermi{systematic} in the way it explores an infinite state space, making
sure it can eventually reach any state that is connected to the initial
state.
For an implicit state space, complexity can be measured in
terms of \(d\), the \newterm{depth} or number of actions in an optimal
solution; \(m\), the maximum number of actions in any path; and \(b\), the
\newterm{branching factor} or number of successors of
a node that need to be considered (after eliminating redundant paths).
\subsection{Breadth-first search}
When all actions have the same cost, an appropriate strategy is
\newtermi{breadth-first search}, in which the root node is expanded first, then
all the successors of the root node are expanded next, then {\em their}
successors, and so on.
That also means we can do an \newtermi{early goal test}, checking
whether a node is a solution as soon as it is {\em generated}, rather than the
\termi{late goal test} that best-first search uses, waiting until a node is
popped off the queue.
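A hedged sketch of breadth-first search with a FIFO frontier and the early goal test just described, reusing the {\tt Node} and {\tt expand} sketches from above:
\begin{verbatim}
# Sketch of breadth-first search: FIFO queue, goal test applied to each
# child as soon as it is generated (early goal test).
from collections import deque

def breadth_first_search(problem):
    node = Node(problem.initial)
    if problem.is_goal(node.state):
        return node
    frontier = deque([node])                     # FIFO queue
    reached = {problem.initial}
    while frontier:
        node = frontier.popleft()
        for child in expand(problem, node):
            if problem.is_goal(child.state):     # early goal test
                return child
            if child.state not in reached:
                reached.add(child.state)
                frontier.append(child)
    return None                                  # failure
\end{verbatim}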
This is called
Dijkstra's algorithm by the theoretical computer science community, and
\newtermi{uniform-cost search} by the AI community.
\subsection{Depth-first search and the problem of memory}
\newterm{Depth-first search} always expands the
{\em deepest} node in the frontier first.
A variant of depth-first search called \newterm{backtracking
search}\nindex{backtracking search}\nindex{search,
backtracking} uses even less memory.
\subsection{Depth-limited and iterative deepening search}
To keep depth-first search from wandering down an infinite path, we can use
\newterm{depth-limited search}, a version of
depth-first search in which we supply a depth limit, $\ell$, and treat all
nodes at depth $\ell$ as if they had no successors (see
\figref{recursive-dls-algorithm}).
This number, known as the \newterm{diameter}\ntindex{diameter (of a
graph)} of the state space graph, gives us a better depth
limit, which leads to a more efficient depth-limited search.
\newterm{Iterative deepening search} solves
the problem of picking a good value for $\ell$ by trying all values: first 0,
then 1, then 2, and so on---until either a solution is found, or the
depth-limited search returns the \var{failure} value rather than the
\var{cutoff} value.
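An illustrative sketch of depth-limited and iterative deepening search, under the same assumptions as the sketches above; the {\tt depth} helper and the string {\tt 'cutoff'} standing in for the cutoff value are assumptions.
\begin{verbatim}
# Sketch: depth-limited search treats nodes at the depth limit as leaves and
# reports 'cutoff'; iterative deepening retries with limits 0, 1, 2, ...
def depth(node):
    d = 0
    while node.parent is not None:
        node, d = node.parent, d + 1
    return d

def depth_limited_search(problem, limit):
    frontier = [Node(problem.initial)]   # LIFO stack (depth-first order)
    result = None                        # None stands for failure
    while frontier:
        node = frontier.pop()
        if problem.is_goal(node.state):
            return node
        if depth(node) >= limit:
            result = 'cutoff'            # something was pruned by the limit
        else:
            frontier.extend(expand(problem, node))
    return result

def iterative_deepening_search(problem):
    limit = 0
    while True:
        result = depth_limited_search(problem, limit)
        if result != 'cutoff':
            return result                # a solution node, or None (failure)
        limit += 1
\end{verbatim}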
An alternative approach called
\newtermi{bidirectional search} simultaneously
searches forward from the start state and backwards from the goal state(s),
keeping track of two frontiers and two tables of {\em reached} states.
\section{Informed (Heuristic) Search Strategies}
This section shows how an \newterm{informed
search} strategy---one that
uses domain-specific hints about the location of goals---can find solutions
more efficiently than an uninformed strategy.
The hints come in the form of a
\newtermi{heuristic function}, denoted
\(h(n)\):\footnote{It may seem odd that the
heuristic function operates on a node, when all it really needs is the node's
state.}
\subsection{Greedy best-first search}
\newterm{Greedy best-first search} is a form
of best-first search that expands first the node with the lowest \(h(n)\)
value---the node that appears to be closest to the goal---on the grounds that
this is likely to lead to a solution quickly.
Let us see how this works for route-finding problems in Romania; we use the
\newterm{straight-line distance}\ntindex{straight-line
distance} heuristic, which we will call
\(h_{\J{SLD}}\).
\subsection{A{\star} search}
The most common informed search algorithm is \newterm{A{\star}
search} (pronounced ``A-star search''), a best-first search that uses
the evaluation function
\[ f(n) = g(n) + h(n), \]
where \(g(n)\) is the path cost from the initial state to node \(n\), and
\(h(n)\) is the estimated cost of the cheapest path from \(n\) to a goal state.
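As a hedged illustration reusing the {\tt best\_first\_search} sketch above, A{\star} is then best-first search with this evaluation function, and greedy best-first search is the special case \(f(n) = h(n)\):
\begin{verbatim}
# Sketch: A* and greedy best-first search as instances of best-first search,
# where h maps a node to an estimated cost-to-goal.
def astar_search(problem, h):
    return best_first_search(problem, f=lambda n: n.path_cost + h(n))

def greedy_best_first_search(problem, h):
    return best_first_search(problem, f=lambda n: h(n))
\end{verbatim}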