Commit 4c08555

committed
Updated Python Data Science labs
1 parent 397c779 commit 4c08555

17 files changed

+42 -52 lines changed

Labs/AI and Machine Learning/TensorFlow/TensorFlow.md

Lines changed: 1 addition & 7 deletions
Original file line number | Diff line number | Diff line change
@@ -51,7 +51,6 @@ This hands-on lab includes the following exercises:
5151
Estimated time to complete this lab: **45** minutes.
5252

5353
<a name="Exercise1"></a>
54-
5554
## Exercise 1: Create an Ubuntu Data Science VM ##
5655

5756
The Data Science Virtual Machine for Linux is a virtual-machine image that simplifies getting started with data science. Multiple tools are already built, installed, and configured in order to get you up and running quickly. The NVIDIA GPU driver, [NVIDIA CUDA](https://developer.nvidia.com/cuda-downloads), and [NVIDIA CUDA Deep Neural Network](https://developer.nvidia.com/cudnn) (cuDNN) library are also included, as are [Jupyter](http://jupyter.org/), several sample Jupyter notebooks, and [TensorFlow](https://www.tensorflow.org/). All pre-installed frameworks are GPU-enabled but work on CPUs as well. In this exercise, you will create an instance of the Data Science Virtual Machine for Linux on Azure.
@@ -77,15 +76,14 @@ The Data Science Virtual Machine for Linux is a virtual-machine image that simpl
7776
Wait until the deployment is complete. It typically takes 5 minutes or less. Observe that the resource group you created contains more than just a virtual machine. It also contains a virtual disk for the VM, a storage account to hold the virtual disk, a virtual IP address, a network security group (NSG) that defines rules for inbound and outbound connections, and more. Placing Azure resources such as these in a resource group has many benefits, including the fact that you can view costs for the resource group as a whole, use role-based access control (RBAC) to restrict access to the resource group's resources, and delete all of the resources in the resource group at once by deleting the resource group itself.
7877

7978
<a name="Exercise2"></a>
80-
8179
## Exercise 2: Connect to the Data Science VM ##
8280

8381
In this exercise, you will connect remotely to the Ubuntu desktop in the VM that you created in the previous exercise. To do so, you need a client that supports [Xfce](https://xfce.org/), which is a lightweight desktop environment for Linux. For background, and for an overview of the various ways you can connect to a DSVM, see [How to access the Data Science Virtual Machine for Linux
8482
](https://docs.microsoft.com/en-us/azure/machine-learning/data-science-virtual-machine/dsvm-ubuntu-intro#how-to-access-the-data-science-virtual-machine-for-linux).
8583

8684
1. If you don't already have an Xfce client installed, download the [X2Go client](https://wiki.x2go.org/doku.php/download:start) and install it before continuing with this exercise. X2Go is a free and open-source Xfce solution that works on a variety of operating systems, including Windows and OS X. The instructions in this exercise assume you are using X2Go, but you may use any client that supports Xfce.
8785

88-
1. Click the virtual-machine resource to open it in the portal.
86+
1. Return to the Azure Portal and click the Data Science VM.
8987

9088
![Opening the virtual machine](Images/open-vm.png)
9189

@@ -126,7 +124,6 @@ In this exercise, you will connect remotely to the Ubuntu desktop in the VM that
126124
Now that you are connected, take a moment to explore the shortcuts on the desktop. These are shortcuts to the numerous data-science tools preinstalled in the VM, which include [Jupyter](http://jupyter.org/), [R Studio](https://www.rstudio.com/), and the [Microsoft Azure Storage Explorer](https://azure.microsoft.com/en-us/features/storage-explorer/), among others.
127125

128126
<a name="Exercise3"></a>
129-
130127
## Exercise 3: Train a TensorFlow model ##
131128

132129
In this exercise, you will train an image-classification model built with [TensorFlow](https://www.tensorflow.org/) to recognize images that contain hot dogs. Rather than create the model from scratch, which would require vast amounts of computing power and tens or hundreds of thousands of images, you will customize a preexisting model, a practice known as [transfer learning](https://en.wikipedia.org/wiki/Transfer_learning). Transfer learning allows you to achieve high levels of accuracy with as little as a few minutes of training time on a typical laptop or PC and as few as several dozen images.
@@ -251,7 +248,6 @@ Training the model involves little more than running a Python script that downlo
251248
The script that you executed in Step 10 specified 500 training steps, which strikes a balance between accuracy and the time required for training. If you would like, try training the model again with a higher ```how_many_training_steps``` value such as 1000 or 2000. A higher step count generally results in higher accuracy, but at the expense of increased training time. Watch out for overfitting, which, as a reminder, is represented by the difference between the orange and blue lines in TensorBoard's Scalars display.
252249
253250
<a name="Exercise4"></a>
254-
255251
## Exercise 4: Create a NotHotDog app ##
256252
257253
In this exercise, you will use [Visual Studio Code](https://code.visualstudio.com/), Microsoft's free, cross-platform source-code editor which is preinstalled in the Data Science VM, to write a NotHotDog app in Python. The app will use [Tkinter](https://wiki.python.org/moin/TkInter), which is a popular GUI framework for Python, to implement its user interface, and it will allow you to select images from your local file system. Then it will pass those images to the model you trained in the previous exercise and tell you whether they contain a hot dog.
@@ -355,7 +351,6 @@ In this exercise, you will use [Visual Studio Code](https://code.visualstudio.co
355351
Continue feeding food images into the app until you're satisfied that it can identify images containing hot dogs. Don't expect it to be right 100% of the time, but do expect it to be right *most* of the time.
356352
357353
<a name="Exercise5"></a>
358-
359354
## Exercise 5: Delete the Data Science VM ##
360355
361356
In this exercise, you will delete the resource group created in [Exercise 1](#Exercise1) when you created the Data Science VM. Deleting the resource group deletes everything in it and prevents any further charges from being incurred for it. A deleted resource group can't be recovered, so be certain you're finished using it before deleting it. However, it is **important not to leave this resource group deployed any longer than necessary** because a Data Science VM is moderately expensive.
@@ -377,7 +372,6 @@ In this exercise, you will delete the resource group created in [Exercise 1](#Ex
377372
After a few minutes, the resource group and all of its resources will be deleted. Billing stops when you click **Delete**, so you're not charged for the time required to delete the resources. Similarly, billing doesn't start until the resources are fully and successfully deployed.
378373
379374
<a name="Summary"></a>
380-
381375
## Summary ##
382376
383377
The steps in this lab may be generalized to perform other types of image-classification tasks. For example, you could train the same TensorFlow model to recognize cat images or identify defective parts produced on an assembly line. Image classification is one of the most prevalent uses of machine learning today, and its usefulness will only increase over time. Now that you have a basis to work from, try creating some image-classification models of your own. You never know what might come of it!
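
The overfitting check described at the end of Exercise 3 (the gap between the training and validation curves in TensorBoard's Scalars display) can also be expressed numerically. The following is a minimal sketch and is not part of the lab's scripts: the function names `overfitting_gap` and `first_overfit_step`, the 0.10 threshold, and the accuracy values are all hypothetical and chosen only for illustration.

```python
def overfitting_gap(train_accuracy, validation_accuracy):
    """Return the per-step gap between training and validation accuracy."""
    if len(train_accuracy) != len(validation_accuracy):
        raise ValueError("Both series must have the same number of steps")
    return [t - v for t, v in zip(train_accuracy, validation_accuracy)]


def first_overfit_step(train_accuracy, validation_accuracy, threshold=0.10):
    """Return the index of the first step whose gap exceeds threshold, or None."""
    gaps = overfitting_gap(train_accuracy, validation_accuracy)
    for step, gap in enumerate(gaps):
        if gap > threshold:
            return step
    return None


# Hypothetical accuracy values for illustration, not results from the lab
train = [0.70, 0.85, 0.95, 0.99]
validation = [0.68, 0.80, 0.83, 0.82]
print(first_overfit_step(train, validation))
```

A widening gap suggests that additional training steps are improving the fit to the training images more than the model's ability to generalize, which is the trade-off to keep in mind when raising ```how_many_training_steps```.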

Labs/Deep Learning/200 - Machine Learning in Python/1 - Ingest/readme.md

Lines changed: 19 additions & 43 deletions
@@ -51,49 +51,23 @@ The Ubuntu Data Science Virtual Machine for Linux is a virtual-machine image tha
5151

5252
1. Open the [Azure Portal](https://portal.azure.com) in your browser. If asked to log in, do so using your Microsoft account.
5353

54-
1. Click **+ Create a resource** in the menu on the left side of the portal, and then type "data science" (without quotation marks) into the search box. Select **Data Science Virtual Machine for Linux (Ubuntu)** from the results list.
54+
1. Click **+ Create a resource** in the menu on the left side of the portal, and then type "data science" into the search box. Select **Data Science Virtual Machine for Linux (Ubuntu)** from the results list.
5555

56-
![Finding the Ubuntu Data Science VM](Images/new-data-science-vm-1.png)
56+
![Finding the Ubuntu Data Science VM](Images/new-data-science-vm.png)
5757

5858
_Finding the Ubuntu Data Science VM_
5959

60-
1. Take a moment to review the list of tools included in the VM. Then click **Create**.
60+
1. Take a moment to review the list of tools included in the VM. Then click **Create** at the bottom of the blade.
6161

62-
![Creating a Data Science VM](Images/new-data-science-vm-2.png)
62+
1. Click **Create new** and enter a name for a new resource group to hold the Data Science VM. Enter a name for the VM and select the region closest to you. Click **Change size** and select **DS1_v2** as the VM size. (The default VM size is about four times more powerful, but also costs about four times more. DS1_v2 is sufficient for the purposes of this lab, and it minimizes the cost to your Azure subscription.) Change "Authentication type" to **Password** and enter a user name and password for logging into the VM. Then click **Review + create** at the bottom of the blade.
6363

64-
_Creating a Data Science VM_
65-
66-
1. Enter a name for the virtual machine and a user name for logging into it. Set **Authentication type** to **Password** and enter a password. *Be sure to remember the user name and password that you enter*, because you will need them to access the VM. Select **Create new** under **Resource group** and enter a resource-group name such as "data-science-rg." Select the **Location** nearest you, and then click **OK**.
67-
68-
![Entering basic settings](Images/create-data-science-vm-1.png)
69-
70-
_Entering basic settings_
71-
72-
1. In the "Choose a size" blade, select **DS1_V2 Standard**, which provides a low-cost way to experiment with Data Science VMs. Then click the **Select** button at the bottom of the blade.
73-
74-
![Choosing a VM size](Images/create-data-science-vm-2.png)
75-
76-
_Choosing a VM size_
77-
78-
1. Click **OK** at the bottom of the "Settings" blade. Review the information presented to you in the "Create" blade, and then click **Create** to start the VM creation process.
64+
![Creating a Data Science VM](Images/create-data-science-vm.png)
7965

80-
![Creating the VM](Images/create-data-science-vm-3.png)
81-
82-
_Creating the VM_
83-
84-
1. Click **Resource groups** in the menu on the left side of the portal. Then click the resource group whose name you specified in Step 4.
85-
86-
![Opening the resource group](Images/open-resource-group.png)
87-
88-
_Opening the resource group_
89-
90-
1. Wait until "Deploying" changes to "Succeeded" indicating that deployment has completed. Deployment typically takes 5 minutes or less. Periodically click **Refresh** at the top of the blade to refresh the deployment status.
91-
92-
![Monitoring the deployment status](Images/deployment-succeeded.png)
66+
_Creating a Data Science VM_
9367

94-
_Monitoring the deployment status_
68+
1. Review the settings presented to you, and click **Create** at the bottom of the blade to begin deploying the VM.
9569

96-
The VM has been created. The next step is to connect to it remotely so you can work with the VM's Ubuntu desktop.
70+
Wait until the deployment is complete. It typically takes 5 minutes or less. Observe that the resource group you created contains more than just a virtual machine. It also contains a virtual disk for the VM, a storage account to hold the virtual disk, a virtual IP address, a network security group (NSG) that defines rules for inbound and outbound connections, and more. Placing Azure resources such as these in a resource group has many benefits, including the fact that you can view costs for the resource group as a whole, use role-based access control (RBAC) to restrict access to the resource group's resources, and delete all of the resources in the resource group at once by deleting the resource group itself.
9771

9872
<a name="Exercise2"></a>
9973
## Exercise 2: Connect to the Data Science VM ##
@@ -102,38 +76,40 @@ In this exercise, you will connect remotely to the Ubuntu desktop in the VM that
10276

10377
1. If you don't already have an Xfce client installed, download the [X2Go client](https://wiki.x2go.org/doku.php/download:start) and install it now. X2Go is a free and open-source Xfce solution that works on a variety of operating systems. The instructions in this exercise assume you are using X2Go, but you can use any client as long as it supports Xfce.
10478

105-
1. Return to the Azure Portal and the blade for the resource group containing the Data Science VM. Then click the VM.
79+
1. Return to the Azure Portal and click the Data Science VM.
10680

107-
![Opening the Data Science VM](Images/open-data-science-vm.png)
81+
![Opening the virtual machine](Images/open-vm.png)
10882

109-
_Opening the Data Science VM_
83+
_Opening the virtual machine_
11084

111-
1. Hover over the IP address shown for the VM and click the **Copy** button that appears to copy the IP address to the clipboard.
85+
1. Hover the cursor over the VM's public IP address and click the **Copy** button that appears next to it to copy the IP address to the clipboard.
11286

113-
![Copying the VM's IP address](Images/copy-ip-address.png)
87+
![Copying the IP address](Images/copy-ip-address.png)
11488

115-
_Copying the VM's IP address_
89+
_Copying the IP address_
11690

117-
1. Start the X2Go client and connect to the Data Science VM at the IP address that's on the clipboard using the user name you specified in the previous exercise. Connect via port **22** (the standard port used for SSH connections), and specify **XFCE** as the session type.
91+
1. Start the X2Go client and connect to the Data Science VM using the IP address on the clipboard and the user name you specified in the previous exercise. Connect via port **22** (the standard port used for SSH connections), and specify **XFCE** as the session type. Click the **OK** button to confirm your preferences.
11892

11993
![Connecting with X2Go](Images/new-session-1.png)
12094

12195
_Connecting with X2Go_
12296

123-
1. In the **New session** panel on the right, select the resolution that you wish to use for the remote desktop. Then click the **New session** panel.
97+
1. In the "New session" panel on the right, select the resolution that you wish to use for the remote desktop. Then click **New session** at the top of the panel.
12498

12599
![Starting a new session](Images/new-session-2.png)
126100

127101
_Starting a new session_
128102

129-
1. Enter the password you specified in [Exercise 1](#Exercise1), and then click the **OK** button. If asked if you trust the host key, answer **Yes**. Also ignore any error messages saying the "SSH daemon could not be started."
103+
1. Enter the password you specified in [Exercise 1](#Exercise1), and then click the **OK** button. If asked if you trust the host key, answer **Yes**. Also ignore any error messages stating that the SSH daemon could not be started.
130104

131105
![Logging into the VM](Images/new-session-3.png)
132106

133107
_Logging into the VM_
134108

135109
1. Wait for the remote desktop to appear and confirm that it resembles the one below.
136110

111+
> If the text and icons on the desktop are too large, terminate the session. Click the icon in the lower-right corner of the "New session" panel and select **Session preferences...** from the menu. Go to the "Input/Output" tab in the "New session" dialog and adjust the display DPI, and then start a new session. Start with 96 DPI and adjust as needed.
112+
137113
![Connected!](Images/ubuntu-desktop.png)
138114

139115
_Connected!_
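
If X2Go cannot connect in Exercise 2, one quick check is whether port 22 on the VM is reachable at all. The following is a minimal sketch that uses only Python's standard `socket` module. The helper name `port_is_open` is hypothetical and not part of the lab. It tests TCP reachability only, not authentication or the X2Go session itself.

```python
import socket


def port_is_open(host, port=22, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection raises OSError on refusal, timeout, or bad host
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, `port_is_open("203.0.113.10")` would test the default SSH port. The address shown is a placeholder from the range reserved for documentation, so substitute the public IP address you copied from the portal.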

Labs/Deep Learning/200 - Machine Learning in Python/4 - Visualize/readme.md

Lines changed: 22 additions & 2 deletions
@@ -29,6 +29,7 @@ This hands-on lab includes the following exercises:
2929
- [Exercise 1: Import Matplotlib](#Exercise1)
3030
- [Exercise 2: Predict on-time arrivals](#Exercise2)
3131
- [Exercise 3: Plot predictions](#Exercise3)
32+
- [Exercise 4: Delete the Data Science VM](#Exercise4)
3233

3334
Estimated time to complete this lab: **20** minutes.
3435

@@ -195,6 +196,27 @@ In this exercise, you will combine the ```predict_delay``` function you created
195196

196197
If you are new to Matplotlib and would like to learn more about it, you will find an excellent tutorial at https://www.labri.fr/perso/nrougier/teaching/matplotlib/. There is *much* more to Matplotlib than what was shown here, which is one reason why it is so popular in the Python community.
197198

199+
<a name="Exercise4"></a>
200+
## Exercise 4: Delete the Data Science VM ##
201+
202+
In this exercise, you will delete the resource group containing the Data Science VM and the associated resources. Deleting the resource group deletes everything in it and prevents any further charges from being incurred for it. A deleted resource group can't be recovered, so be certain you're finished using it before deleting it. However, it is **important not to leave this resource group deployed any longer than necessary** because a Data Science VM is moderately expensive.
203+
204+
1. Click **Resource groups** in the menu on the left side of the portal to show a list of resource groups. Then click the resource group containing the Data Science VM.
205+
206+
![Opening the resource group](Images/open-resource-group.png)
207+
208+
_Opening the resource group_
209+
210+
1. Click **Delete resource group** at the top of the blade.
211+
212+
![Deleting the resource group](Images/delete-resource-group.png)
213+
214+
_Deleting the resource group_
215+
216+
1. For safety, you are required to type in the resource group's name. (Once deleted, a resource group cannot be recovered.) Type the name of the resource group. Then click the **Delete** button to remove all traces of this lab from your Azure subscription.
217+
218+
After a few minutes, the resource group and all of its resources will be deleted. Billing stops when you click **Delete**, so you're not charged for the time required to delete the resources. Similarly, billing doesn't start until the resources are fully and successfully deployed.
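
The billing window described above, which runs from the moment deployment completes to the moment you click **Delete**, can be sketched as simple arithmetic. This is an illustration under stated assumptions, not a statement of how Azure meters usage: the function name `estimated_charge`, the timestamps, and the hourly rate are all hypothetical.

```python
from datetime import datetime


def estimated_charge(deployed_at, delete_clicked_at, hourly_rate):
    """Estimate the charge for the window between deployment and deletion."""
    if delete_clicked_at < deployed_at:
        raise ValueError("delete_clicked_at must not precede deployed_at")
    hours = (delete_clicked_at - deployed_at).total_seconds() / 3600
    return hours * hourly_rate


# Hypothetical timestamps and rate, for illustration only
deployed = datetime(2018, 6, 1, 9, 0)
deleted = datetime(2018, 6, 1, 11, 30)
print(estimated_charge(deployed, deleted, hourly_rate=0.10))
```

Because the estimate grows linearly with the time the resource group stays deployed, deleting it as soon as you are finished keeps the cost of the lab to a minimum.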
219+
198220
<a name="Summary"></a>
199221
## Summary ##
200222

@@ -209,8 +231,6 @@ In four hands-on labs, you learned how to:
209231

210232
Pandas, Scikit-learn, and Matplotlib are three of the most popular Python libraries on the planet. With them, you can prepare data for use in machine learning, build sophisticated machine-learning models from the data, and chart the output. They are among dozens of tools preinstalled in Microsoft's Data Science VM, and they are just the tip of the iceberg in terms of what you can do with it. For more information, see https://docs.microsoft.com/azure/machine-learning/data-science-virtual-machine/dsvm-tools-overview.
211233

212-
Once you're finished using the Data Science VM, you should delete it so it no longer charges to your Azure subscription. To delete it and the other resources that were created along with it, simply go to the Azure Portal and delete the "data-science-rg" resource group that you created in [Lab 1](../1%20-%20Ingest). That's one of the many benefits of using resource groups: one simple action deletes the resource group and everything inside it. Once deleted, a resource group cannot be recovered, so make sure you're finished with it before deleting it.
213-
214234
---
215235

216236
Copyright 2018 Microsoft Corporation. All rights reserved. Except where otherwise noted, these materials are licensed under the terms of the MIT License. You may use them according to the license as is most appropriate for your project. The terms of this license can be found at https://opensource.org/licenses/MIT.
