
Room5: Korea Univ/Artificial Intelligence(COSE361)

[Artificial Intelligence] 1.1 What is AI? & 2.3 The Nature of Environments

1. What is AI (Artificial Intelligence)?

 : Let's build machines that imitate human intelligence!

   - Thinking Humanly : What does it even mean to think like a human?

   - Thinking Rationally : Related to logic (syllogisms)

   - Acting Humanly : Turing test

   - Acting Rationally : Acting rationally is different from thinking rationally (acting does not necessarily mean thinking)

     EX) A navigation system

 

In short, AI = Science (thinking) & Engineering (acting)

 

 

 

(The items in parentheses are the task environment for an automated taxi)

 

 

2. The Nature Of Environments

 1) Specifying the task environment

  : Given an agent, we must specify its PEAS description.

   - Performance Measure (Safe, fast, legal, comfortable trip, maximize profits) : Goals such as heading in the right direction and minimizing trip time

   - Environment (Roads, other traffic, pedestrians, customers) : The surroundings the agent is given

   - Actuators (Steering, accelerator, brake, signal, horn, display) : The actions that can be carried out in the current state

   - Sensors (Cameras, sonar, speedometer, GPS ...)  : The sensors used to determine the current state

 Together, these four make up the task environment!
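
 The PEAS description above can be written down as a small record. This is only a minimal sketch: the `PEAS` class and its field names are hypothetical, chosen for illustration, and the values are the automated-taxi items listed above.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class PEAS:
    """Hypothetical container for a PEAS task-environment description."""
    performance_measure: Tuple[str, ...]
    environment: Tuple[str, ...]
    actuators: Tuple[str, ...]
    sensors: Tuple[str, ...]


# Values copied from the automated-taxi example in these notes.
automated_taxi = PEAS(
    performance_measure=("safe", "fast", "legal", "comfortable trip", "maximize profits"),
    environment=("roads", "other traffic", "pedestrians", "customers"),
    actuators=("steering", "accelerator", "brake", "signal", "horn", "display"),
    sensors=("cameras", "sonar", "speedometer", "GPS"),
)

# Print one line per PEAS component.
for name, items in vars(automated_taxi).items():
    print(f"{name}: {', '.join(items)}")
```

 Making the record immutable (`frozen=True`) reflects that the description is a specification written before the agent is designed, not something the agent changes while it runs.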

 

  2) Properties of task environments

 

    i. Fully observable vs Partially observable (unobservable)

      : Whether or not the agent's sensors can capture the complete state of the environment at each point in time

 

   ii. Single agent vs multiagent

      : The number of agents. Chess is a competitive multiagent environment. In taxi driving, avoiding collisions maximizes the performance of all agents, so it is a partially cooperative multiagent environment.

 

  iii. Deterministic vs Stochastic

      : Whether the next state is completely determined by the current state and the action the agent executes. If it is, the environment is deterministic; otherwise it is stochastic. In a fully observable, deterministic environment the agent need not worry about uncertainty, but a partially observable environment may appear stochastic.

    When an environment is not fully observable or not deterministic, we say it is uncertain. The word stochastic generally means that uncertainty about outcomes is quantified in terms of probabilities. A nondeterministic environment is one in which actions are characterized by their possible outcomes, but no probabilities are attached to them. Nondeterministic environment descriptions are usually associated with performance measures that require the agent to succeed for all possible outcomes of its actions.
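
    The three kinds of transition behaviour described above can be contrasted in code. The sketch below is illustrative only: the two-state world (`"A"`, `"B"`), the action name `"right"`, the function names, and the probabilities 0.8 and 0.2 are all made-up assumptions, not part of the course material.

```python
import random
from typing import Dict, Set

State = str
Action = str


def deterministic_step(state: State, action: Action) -> State:
    """Deterministic: each (state, action) pair has exactly one next state."""
    table = {("A", "right"): "B", ("B", "left"): "A"}
    return table[(state, action)]


def stochastic_model(state: State, action: Action) -> Dict[State, float]:
    """Stochastic: possible next states are quantified with probabilities."""
    table = {("A", "right"): {"B": 0.8, "A": 0.2}}  # illustrative numbers
    return table[(state, action)]


def nondeterministic_model(state: State, action: Action) -> Set[State]:
    """Nondeterministic: possible next states are listed, with no probabilities."""
    table = {("A", "right"): {"A", "B"}}
    return table[(state, action)]


def sample(distribution: Dict[State, float], rng: random.Random) -> State:
    """Draw one next state from a stochastic model."""
    states = list(distribution)
    weights = [distribution[s] for s in states]
    return rng.choices(states, weights=weights, k=1)[0]


print(deterministic_step("A", "right"))
print(stochastic_model("A", "right"))
print(nondeterministic_model("A", "right"))
```

    An agent in the nondeterministic case has to prepare for every element of the returned set, whereas in the stochastic case it can weigh outcomes by their probabilities.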

 

  iv. Episodic vs Sequential

      : In an episodic task environment, the agent's experience is divided into atomic episodes. In each episode the agent perceives the situation and then performs a single action. The next episode does not depend on the actions taken in previous episodes. (Most classification tasks are episodic.)

  In a sequential environment, the current decision can affect all future decisions. Chess and taxi driving are sequential. Episodic environments are simpler than sequential ones.

 

   v. Static vs dynamic

      : If the environment can change while the agent is deliberating, we say the environment is dynamic; otherwise it is static. Static environments are comparatively easy to deal with, whereas a dynamic environment is continuously asking the agent what it wants to do.

  If the environment itself does not change with the passage of time but the agent's performance score does, we say the environment is semidynamic. Taxi driving is fully dynamic, chess played with a clock is semidynamic, and crossword puzzles are static.
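
  The static / semidynamic / dynamic distinction reduces to two yes-or-no questions, which can be sketched as below. The function name `classify_dynamics` and its parameter names are hypothetical; the three examples are the ones given in these notes.

```python
def classify_dynamics(environment_changes: bool, score_changes_with_time: bool) -> str:
    """Classify an environment by what changes while the agent deliberates."""
    if environment_changes:
        return "dynamic"
    if score_changes_with_time:
        return "semidynamic"
    return "static"


# Examples taken from the notes.
examples = {
    "taxi driving": classify_dynamics(True, True),
    "chess with a clock": classify_dynamics(False, True),
    "crossword puzzle": classify_dynamics(False, False),
}

for task, label in examples.items():
    print(f"{task}: {label}")
```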

 

   vi. Discrete vs Continuous

       : This distinction applies to the state of the environment, to the way time is handled, and to the agent's percepts and actions. For example, the chess environment has a finite number of distinct states, and chess also has a discrete set of percepts and actions. Taxi driving, in contrast, is a continuous-state and continuous-time problem.

 

   vii. Known vs Unknown

       :  This is not about the environment itself but about the agent's state of knowledge. In a known environment, the outcomes of all actions are given; in an unknown environment, the agent has to learn how the environment works in order to make good decisions. (This is a different concept from fully vs. partially observable.) In a solitaire card game, I know the rules but cannot see all the cards, so it is a known but partially observable environment. Conversely, in a new video game I can see the entire game state but do not know what each button does until I press it, so it is an unknown but fully observable environment.

 

* If the environment is fully observable, deterministic, discrete, and known, then the solution is a fixed sequence of actions.
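
To see why a fixed action sequence suffices in that case, here is a minimal sketch on a made-up 3x3 grid world. The grid size, the wall positions, and the names `step`, `plan`, and `execute` are assumptions for illustration, and breadth-first search is just one possible way to find the sequence. Because the transition function is known and deterministic, the plan can be computed in advance and then executed without looking at any percepts.

```python
from collections import deque
from typing import Dict, List, Optional, Tuple

State = Tuple[int, int]  # (x, y) cell in a hypothetical grid world

MOVES: Dict[str, Tuple[int, int]] = {
    "up": (0, -1), "down": (0, 1), "left": (-1, 0), "right": (1, 0),
}
WIDTH, HEIGHT = 3, 3
WALLS = {(1, 0), (1, 1)}  # illustrative obstacles


def step(state: State, action: str) -> State:
    """Known, deterministic transition model: blocked moves leave the state unchanged."""
    dx, dy = MOVES[action]
    nxt = (state[0] + dx, state[1] + dy)
    inside = 0 <= nxt[0] < WIDTH and 0 <= nxt[1] < HEIGHT
    return nxt if inside and nxt not in WALLS else state


def plan(start: State, goal: State) -> Optional[List[str]]:
    """Breadth-first search for a shortest action sequence from start to goal."""
    frontier = deque([(start, [])])
    visited = {start}
    while frontier:
        state, actions = frontier.popleft()
        if state == goal:
            return actions
        for action in MOVES:
            nxt = step(state, action)
            if nxt not in visited:
                visited.add(nxt)
                frontier.append((nxt, actions + [action]))
    return None  # goal unreachable


def execute(start: State, actions: List[str]) -> State:
    """Run the fixed sequence open-loop, without consulting any percepts."""
    state = start
    for action in actions:
        state = step(state, action)
    return state


start, goal = (0, 0), (2, 0)
actions = plan(start, goal)
print(actions)
print(execute(start, actions) == goal)
```

If any of the four properties fails (for example, if `step` were stochastic), the same sequence could no longer be guaranteed to reach the goal, and the agent would need to observe and react along the way.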