AWS EC2-Based Real-Time Big Data Processing System: Hands-On Lab Course
This is a comprehensive hands-on project that builds an FMS (Factory Management System) big data pipeline on AWS EC2 using Terraform and Ansible. It implements a complete data pipeline that collects, processes, stores, and monitors sensor data from 100 devices in real time.
- Data collection: Python Collector + FMS API
- Message queue: Apache Kafka (2-node cluster, on EC2)
- Stream processing: Apache Spark Streaming (on EC2)
- Distributed storage: Hadoop with AWS S3 configured as the storage backend (on EC2)
- Monitoring: Grafana + Prometheus (on EC2)
- Infrastructure management: AWS EC2, Terraform, Ansible
- Throughput: 47 msg/sec (94% of the 50 msg/sec target)
- Latency: 25 seconds end to end (target: within 30 seconds)
- Availability: 99.7% (target: 99% or higher)
- Data quality: 96.2% (target: 95% or higher)
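The comparison of each measured value against its target can be sketched as a small calculation. The numbers are copied from the list above; the `higher_is_better` flag and the function names are assumptions made for this illustration, not part of the project code.

```python
# Measured values and targets are taken from the list above.
# The "higher_is_better" flag is an assumption added for this sketch.
METRICS = {
    "throughput_msg_per_sec": {"measured": 47.0, "target": 50.0, "higher_is_better": True},
    "latency_sec": {"measured": 25.0, "target": 30.0, "higher_is_better": False},
    "availability_pct": {"measured": 99.7, "target": 99.0, "higher_is_better": True},
    "data_quality_pct": {"measured": 96.2, "target": 95.0, "higher_is_better": True},
}


def attainment(metric: dict) -> float:
    """Return measured / target as a plain ratio."""
    return metric["measured"] / metric["target"]


def meets_target(metric: dict) -> bool:
    """Check the target in the direction that makes sense for the metric."""
    if metric["higher_is_better"]:
        return metric["measured"] >= metric["target"]
    return metric["measured"] <= metric["target"]


if __name__ == "__main__":
    for name, metric in METRICS.items():
        status = "met" if meets_target(metric) else "not met"
        print(f"{name}: {attainment(metric):.1%} of target ({status})")
```

For latency the ratio reads as the share of the 30-second budget that is used, so a value below 100% is the desired outcome there.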
Ch. | Folder | Title | Key topics | Main deliverables |
---|---|---|---|---|
1 | 01-pre-lab-introduction/ | Pre-Lab introduction | Project goals, overview of the AWS lab environment, account preparation | Environment checklist |
2 | 02-aws-account-setup/ | AWS account and permission setup | IAM, key pairs, S3 bucket, basic VPC configuration | Account/permission checklist, S3 bucket |
3 | 03-infra-provisioning/ | Infrastructure automation (IaC) | Automatic creation of EC2, VPC, SG, S3, and so on with Terraform | Terraform code, infrastructure diagram |
4 | 04-ansible-automation/ | Automated service deployment | Automatic installation of Hadoop/Spark with Ansible | Ansible playbooks, deployment logs |
5 | 05-architecture-design/ | Architecture design and review | AWS-based distributed architecture design, risk analysis | Architecture diagram, risk analysis |
6 | 06-hadoop-spark-cluster/ | Hadoop/Spark cluster setup | EC2-based distributed storage (with AWS S3 as the storage backend) and compute environment | Cluster setup scripts, monitoring |
7 | 07-kafka-streaming/ | Kafka real-time streaming | EC2-based Kafka cluster, data collector | Kafka cluster, data collector |
8 | 08-data-quality-requirements/ | Data quality requirements | Data structure analysis, quality rules, validation logic | Quality rule definitions, validation scripts |
9 | 09-spark-data-transformation/ | Data transformation logic | Spark SQL, DataFrame API, UDF | Data transformation module, quality validator |
10 | 10-integrated-pipeline/ | Integrated pipeline development | End-to-end integration, error handling | Integrated pipeline, error handler |
11 | 11-visualization-monitoring/ | Visualization and monitoring | Grafana, Prometheus, CloudWatch integration | Dashboards, monitoring report |
12 | 12-cost-optimization-security/ | Cost optimization and security | Cost monitoring, IAM policies, S3 bucket policies | Cost report, security checklist |
13 | 13-advanced-aws-analytics/ | Glue/Athena/extended labs | Glue ETL, Athena queries, S3 data lake | ETL scripts, Athena query examples |
14 | 14-project-documentation/ | Project documentation | Technical documentation, operations guide, improvement plan | Architecture document, troubleshooting guide |
15 | 15-presentation-feedback/ | Results presentation and feedback | Project presentation, demo, evaluation | Presentation materials, demo script |
- AWS account with EC2 permissions
- Console server with Terraform and Ansible installed (Oracle Linux EC2)
- At least 8 GB RAM and 50 GB disk
- Installing Terraform and Ansible on an Oracle Linux EC2 instance: https://github.com/Finfra/awsHadoop/tree/main/7.HadoopEco
- Installing Hadoop and Spark with Ansible: https://github.com/Finfra/awsHadoop/tree/main/7.HadoopEco
Terraform automatically creates the EC2 instances (s1, s2, s3).
Install Hadoop and Spark automatically with Ansible:
cd ../7.HadoopEco
ansible-playbook -i hadoopInstall/df/i1/ansible-hadoop/hosts hadoopInstall/df/i1/ansible-hadoop/hadoop_install.yml -e ansible_python_interpreter=/usr/bin/python3
ansible-playbook -i hadoopInstall/df/i1/ansible-spark/hosts hadoopInstall/df/i1/ansible-spark/spark_install.yml -e ansible_python_interpreter=/usr/bin/python3
Web UIs of the services deployed on EC2:
- Hadoop NameNode:
http://[s1-instance-ip]:9870
- YARN ResourceManager:
http://[s1-instance-ip]:8088
- Spark Master:
http://[s1-instance-ip]:8080
- Prometheus:
http://[s1-instance-ip]:9090
┌──────────────────────────────────┐
│  i1 : Ansible, Terraform, kafka  │
└──────────────────────────────────┘
     │           │           │
     ▼           ▼           ▼
   ┌────┐      ┌────┐      ┌────┐
   │ s1 │      │ s2 │      │ s3 │
   └────┘      └────┘      └────┘
┌──────────────────────────────────────── FMS BigData Pipeline ─────────────────────────────────────────┐
│                                                                                                       │
│ [FMS API] →       [Collector] →          [Kafka] →     [Spark] →                [HDFS/S3]             │
│     ↓                 ↓                      ↓             ↓                        ↓                 │
│ [Sensor data]     [Validate/transform]   [Buffering]   [Real-time processing]   [Distributed storage] │
│     ↓                 ↓                      ↓             ↓                        ↓                 │
│ [10 s interval]   [Error handling]       [2 brokers]   [Quality checks]         [Partitioning]        │
│                                                                                                       │
│ ┌────── Monitoring layer ───────────────────────────────────────────────────────────────────────────┐ │
│ │ [Prometheus] →         [Grafana] →       [AlertManager]                                           │ │
│ │     ↓                      ↓                 ↓                                                    │ │
│ │ [Metrics collection]   [Visualization]   [Alert delivery]                                         │ │
│ └───────────────────────────────────────────────────────────────────────────────────────────────────┘ │
└───────────────────────────────────────────────────────────────────────────────────────────────────────┘
Node | Role and services | Main ports |
---|---|---|
i1 (console server) | Terraform and Ansible management server | SSH(22) |
s1 (master node) | HDFS NameNode, YARN ResourceManager, Spark Master, Prometheus | SSH(22), NameNode UI(9870), ResourceManager UI(8088), Spark Master UI(8080), Prometheus(9090) |
s2, s3 (worker nodes) | HDFS DataNode, YARN NodeManager, Spark Worker, Node Exporter | SSH(22), DataNode UI(9864), NodeManager UI(8042), Spark Worker UI(8081), Node Exporter(9100) |
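The port assignments in the table can be turned into a small helper that lists the web endpoints of a node and, optionally, checks whether a port accepts TCP connections. This is a minimal sketch: the port numbers come from the table, while `WEB_UI_PORTS`, `ui_urls`, `is_port_open`, and the example IP address are hypothetical names and values chosen for illustration.

```python
import socket

# Port numbers copied from the node table; the grouping by role is this sketch's own.
WEB_UI_PORTS = {
    "master": {
        "NameNode UI": 9870,
        "ResourceManager UI": 8088,
        "Spark Master UI": 8080,
        "Prometheus": 9090,
    },
    "worker": {
        "DataNode UI": 9864,
        "NodeManager UI": 8042,
        "Spark Worker UI": 8081,
        "Node Exporter": 9100,
    },
}


def ui_urls(host: str, role: str) -> dict:
    """Build the HTTP URL of every service expected on a node with the given role."""
    return {name: f"http://{host}:{port}" for name, port in WEB_UI_PORTS[role].items()}


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # 10.0.0.11 is a placeholder; substitute the real address of s1.
    for name, url in ui_urls("10.0.0.11", "master").items():
        print(f"{name}: {url}")
```

A closed port usually points to either a stopped service or a security group rule that does not allow the connection, so it is worth checking both.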
- Data is collected through the FMS API at 10-second intervals.
- Source addresses: curl finfra.iptime.org:9872/1/ through curl finfra.iptime.org:9872/100/ (devices 1 to 100)
- Data collection example:
$ curl finfra.iptime.org:9872/1/
{"time": "2025-07-07T08:21:29Z", "DeviceId": 1, "sensor1": 85.69, "sensor2": 85.81, "sensor3": 82.15, "motor1": 1245.16, "motor2": 874.81, "motor3": 1119.36, "isFail": false}
$ curl finfra.iptime.org:9872/100/
{"time": "2025-07-07T08:21:32Z", "DeviceId": 100, "sensor1": 175.29, "sensor2": 84.14, "sensor3": 148.35, "motor1": 1847.49, "motor2": 146.12, "motor3": 2155.11, "isFail": false}
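A collector would typically parse and sanity-check each response before forwarding it. The sketch below works on the sample response shown above and needs no network access. The field names and the device range of 1 to 100 come from this document; the function name `parse_record` and the specific checks are assumptions, not the project's actual validation rules.

```python
import json
from datetime import datetime, timezone

# Sample response copied from the curl output above.
SAMPLE = (
    '{"time": "2025-07-07T08:21:29Z", "DeviceId": 1, "sensor1": 85.69, '
    '"sensor2": 85.81, "sensor3": 82.15, "motor1": 1245.16, "motor2": 874.81, '
    '"motor3": 1119.36, "isFail": false}'
)

NUMERIC_FIELDS = ("sensor1", "sensor2", "sensor3", "motor1", "motor2", "motor3")


def parse_record(raw: str) -> dict:
    """Parse one FMS API response and run basic structural checks.

    Raises KeyError for a missing field and ValueError for an invalid value.
    """
    record = json.loads(raw)

    device_id = record["DeviceId"]
    # bool is a subclass of int in Python, so it is excluded explicitly.
    if isinstance(device_id, bool) or not isinstance(device_id, int):
        raise ValueError("DeviceId must be an integer")
    if not 1 <= device_id <= 100:
        raise ValueError(f"DeviceId out of range: {device_id}")

    for field in NUMERIC_FIELDS:
        value = record[field]
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise ValueError(f"{field} must be numeric")

    if not isinstance(record["isFail"], bool):
        raise ValueError("isFail must be a boolean")

    # The API reports UTC timestamps with a trailing "Z".
    timestamp = datetime.strptime(record["time"], "%Y-%m-%dT%H:%M:%SZ")
    record["time"] = timestamp.replace(tzinfo=timezone.utc)
    return record


if __name__ == "__main__":
    print(parse_record(SAMPLE))
```

Range checks on the sensor and motor values are deliberately left out, because the acceptable ranges are not stated in this document.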
- Data collection: the Python Collector gathers sensor data from the FMS API at 10-second intervals
- Message queuing: Kafka reliably buffers the collected data (2 brokers)
- Stream processing: Spark Streaming transforms the data and validates its quality in real time
- Distributed storage: data is stored in HDFS in partitions (by device and by time)
- Monitoring: real-time visualization and alerting through Grafana, Prometheus, and CloudWatch
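Partitioning by device and by time, as described in the storage step above, usually comes down to deriving a directory path from each record. The following is a minimal sketch under stated assumptions: the `device_id=/date=/hour=` layout, the function name `partition_path`, and the `s3a://example-bucket/fms` base path are all hypothetical, since the document does not specify the actual layout.

```python
from datetime import datetime, timezone


def partition_path(base: str, device_id: int, ts: datetime) -> str:
    """Build a partition directory for one record (hypothetical layout).

    The timestamp must be timezone-aware; it is normalized to UTC so that
    records from the same hour always land in the same partition.
    """
    if ts.tzinfo is None:
        raise ValueError("timestamp must be timezone-aware")
    ts = ts.astimezone(timezone.utc)
    return (
        f"{base.rstrip('/')}/device_id={device_id}"
        f"/date={ts:%Y-%m-%d}/hour={ts:%H}"
    )


if __name__ == "__main__":
    sample_ts = datetime(2025, 7, 7, 8, 21, 29, tzinfo=timezone.utc)
    # example-bucket is a placeholder name, not a bucket used by the project.
    print(partition_path("s3a://example-bucket/fms", 1, sample_ts))
```

A `key=value` directory layout is a common choice because Spark can recover the partition columns from the path when reading the data back.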
# Check EC2 instance status
aws ec2 describe-instances --filters "Name=instance-state-name,Values=running"
# Ansible playbook execution problems: check the playbook syntax
ansible-playbook --syntax-check [playbook.yml]
# Check service logs
ssh ec2-user@[instance-ip] 'sudo journalctl -u [service-name]'
# Restart a service
ssh ec2-user@[instance-ip] 'sudo systemctl restart [service-name]'
- Out of memory: upgrade the EC2 instance type
- Disk space: clean up old logs and data
- Cost overruns: set up AWS cost monitoring and alerts
- The following technologies may be used as substitutes during the project.
Open-source technology | AWS managed service alternative |
---|---|
Apache Spark | Amazon EMR, AWS Glue |
Apache Kafka | Amazon Kinesis Data Streams |
Prometheus | Amazon CloudWatch |
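- Introduce Auto Scaling groups
- Automate failure recovery
- Refine IAM permissions to a finer granularity
- Triple the throughput (150 msg/sec)
- Add a caching layer
- Introduce GPU-accelerated computation
- Adopt AWS Glue ETL
- Build Athena queries and a data lake
- Strengthen CloudWatch-based monitoring and alerting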
bigdataAwsPreLab/
├── 01-pre-lab-introduction/
├── 02-aws-account-setup/
├── 03-infra-provisioning/
├── 04-ansible-automation/
├── 05-architecture-design/
├── 06-hadoop-spark-cluster/
├── 07-kafka-streaming/
├── 08-data-quality-requirements/
├── 09-spark-data-transformation/
├── 10-integrated-pipeline/
├── 11-visualization-monitoring/
├── 12-cost-optimization-security/
├── 13-advanced-aws-analytics/
├── 14-project-documentation/
├── 15-presentation-feedback/
└── README.md
Each folder consists of README.md (theory and explanations) and src/ (lab code and scripts).
- Fork the project
- Create a feature branch (git checkout -b feature/AmazingFeature)
- Commit your changes (git commit -m 'Add some AmazingFeature')
- Push to the branch (git push origin feature/AmazingFeature)
- Open a Pull Request
This project is distributed under the MIT License. See the LICENSE file for details.
- Issue tracking: use GitHub Issues
- Technical documentation: see the detailed guide in each chapter's README.md
- Urgent support: see the instructor's contact details in the lecture notes
🎯 Lab goal: through theory and hands-on practice, this course gives you end-to-end experience in building an AWS-based real-time big data processing system and develops practical skills you can apply directly on the job.
Happy Learning!