Integrate custom dependencies in Amazon SageMaker Canvas workflows

When implementing machine learning (ML) workflows in Amazon SageMaker Canvas, organizations might need to account for external dependencies required by their specific use cases. Although SageMaker Canvas provides powerful no-code and low-code capabilities for rapid experimentation, some projects require specialized dependencies and libraries that SageMaker Canvas doesn't include by default. This post gives an example of how to incorporate code that relies on external dependencies into your SageMaker Canvas workflows.
Amazon SageMaker Canvas is a low-code no-code (LCNC) ML platform that guides users through every stage of the ML journey, from initial data preparation to final model deployment. Without writing a single line of code, users can explore datasets, transform data, build models, and generate predictions.
SageMaker Canvas provides comprehensive data wrangling capabilities to help you prepare your data, including:
- More than 300 built-in transformation steps
- Feature engineering capabilities
- Data normalization and cleansing functions
- A custom code editor supporting Python, PySpark, and SparkSQL
In this post, we show how to incorporate dependencies stored in Amazon Simple Storage Service (Amazon S3) within an Amazon SageMaker Data Wrangler flow. Using this approach, you can run custom scripts that depend on modules not natively supported by SageMaker Canvas.
Solution overview
To showcase the integration of custom scripts and dependencies from Amazon S3 into SageMaker Canvas, we explore the following example workflow.
The solution follows three main steps:
- Upload custom scripts and dependencies to Amazon S3
- Use SageMaker Data Wrangler in SageMaker Canvas to transform your data using the uploaded code
- Train and export the model
The following diagram shows the solution architecture.
In this example, we work with two complementary datasets available in SageMaker Canvas that contain shipping information for computer screen deliveries. By joining these datasets, we create a comprehensive dataset that captures various shipping metrics and delivery outcomes. Our goal is to build a predictive model that can determine whether future shipments will arrive on time, based on historical shipping patterns and characteristics.
Prerequisites
As a prerequisite, you need access to Amazon S3 and Amazon SageMaker AI. If you don't already have a SageMaker AI domain configured in your account, you also need permissions to create a SageMaker AI domain.
Create the data flow
To create the data flow, follow these steps:
- On the Amazon SageMaker AI console, in the navigation pane, under Applications and IDEs, select Canvas, as shown in the following screenshot. You might need to create a SageMaker domain if you haven't already done so.
- After your domain is created, select Open Canvas.
- In Canvas, select the Datasets tab and select canvas-sample-shipping-logs.csv, as shown in the following screenshot. After the preview appears, select + Create a data flow.
The initial data flow opens with one source and one data type.
- At the top right of the screen, select Add data → tabular. Choose Canvas Datasets as the source and select canvas-sample-product-descriptions.csv.
- Choose Next, as shown in the following screenshot, and then choose Import.
- After both datasets have been added, select the plus sign. From the drop-down menu, select Combine data. From the next drop-down menu, select Join.
- To perform an inner join on the ProductId column, in the right-hand menu, under Join type, select Inner join. Under Join keys, select ProductId, as shown in the following screenshot.
- After the datasets have been joined, select the plus sign. In the drop-down menu, select + Add transform. A preview of the data opens.
The dataset contains the columns XShippingDistance (long) and YShippingDistance (long). For our purposes, we want to use a custom function that finds the total distance from the X and Y coordinates and then drops the individual coordinate columns. For this example, we find the total distance using a function that relies on the mpmath library.
- To call the custom function, select + Add transform. In the drop-down menu, select Custom transform. Change the editor to Python (Pandas) and try to run the following function from the Python editor:
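The function listing itself is not reproduced in this text. A minimal sketch of what such a transform might look like follows, assuming a Euclidean distance and an output column named TotalDistance (both are assumptions, not confirmed by the post). In the Canvas custom transform editor the DataFrame is assumed to be available as df already; the two sample rows here are hypothetical and only make the sketch runnable outside Canvas.

```python
import pandas as pd
from mpmath import sqrt  # external dependency; this import is what fails inside Canvas


def calculate_total_distance(df, x_col="XShippingDistance",
                             y_col="YShippingDistance",
                             new_col="TotalDistance"):
    """Add a Euclidean total-distance column, then drop the X and Y columns."""
    df = df.copy()
    df[new_col] = df.apply(
        lambda row: float(sqrt(float(row[x_col]) ** 2 + float(row[y_col]) ** 2)),
        axis=1,
    )
    return df.drop(columns=[x_col, y_col])


# Hypothetical sample rows; in Canvas, `df` is assumed to be provided by the editor.
df = pd.DataFrame({"XShippingDistance": [3, 6], "YShippingDistance": [4, 8]})
df = calculate_total_distance(df)
```

Outside Canvas, with mpmath installed, this runs without error; inside Canvas the import on the second line is the part that fails.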
Running the function produces the following error: ModuleNotFoundError: No module named 'mpmath', as shown in the following screenshot.
This error occurs because mpmath isn't a module that SageMaker Canvas supports natively. To use a function that relies on this module, we need to take a different approach to the custom function.
Zip the script and dependencies
To use a function that relies on a module not natively supported in Canvas, the custom script must be zipped together with the module(s) it relies on. For this example, we used our local integrated development environment (IDE) to create a script.py that relies on the mpmath library.
The script.py file contains two functions: one compatible with the Python (Pandas) runtime (function calculate_total_distance), and one compatible with the Python (PySpark) runtime (function udf_total_distance).
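The contents of script.py are not reproduced in this text. As a rough sketch only, the PySpark variant might be structured as follows. The dependency_path parameter, the nested helper, the TotalDistance column name, and the Euclidean formula are all assumptions made for illustration. The imports sit inside the functions so that the module can be loaded even where PySpark or mpmath is not yet importable.

```python
def udf_total_distance(df, x_col="XShippingDistance",
                       y_col="YShippingDistance",
                       new_col="TotalDistance",
                       dependency_path=None):
    """PySpark variant: add new_col through a UDF, then drop the X and Y columns."""
    from pyspark.sql.functions import udf
    from pyspark.sql.types import DoubleType

    def _distance(x, y):
        # Nested so that Spark serializes the helper together with the UDF.
        import sys
        if x is None or y is None:
            return None
        # dependency_path is a hypothetical folder holding the extracted mpmath package.
        if dependency_path and dependency_path not in sys.path:
            sys.path.insert(0, dependency_path)
        from mpmath import sqrt
        return float(sqrt(float(x) ** 2 + float(y) ** 2))

    distance_udf = udf(_distance, DoubleType())
    df = df.withColumn(new_col, distance_udf(df[x_col], df[y_col]))
    return df.drop(x_col, y_col)
```

Keeping the mpmath import inside the row-level helper means it is resolved where the UDF actually runs, after the extracted folder has been added to the path.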
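To ensure that the script can run, install mpmath into the same directory as script.py by running pip install mpmath. A plain pip install places the package in the active environment, so pip's --target option (for example, pip install mpmath --target .) may be needed to put it next to script.py.
Run zip -r my_project.zip . to create a .zip file containing the function and the mpmath installation (the trailing . tells zip to archive the current directory). The current directory now contains a .zip file, our Python script, and the installation our script depends on, as shown in the following screenshot.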
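The same packaging step can also be scripted. The following is a minimal sketch using only the Python standard library; the function names install_into and zip_project and the example paths are hypothetical, and it assumes pip is available for the current interpreter.

```python
import shutil
import subprocess
import sys
from pathlib import Path


def install_into(project_dir, package="mpmath"):
    """Install `package` next to script.py using pip's --target option."""
    subprocess.run(
        [sys.executable, "-m", "pip", "install", package,
         "--target", str(project_dir)],
        check=True,
    )


def zip_project(project_dir, archive_base):
    """Zip the contents of project_dir and return the path of the created .zip."""
    # Keep the archive outside project_dir so it does not try to include itself.
    return shutil.make_archive(str(archive_base), "zip",
                               root_dir=str(Path(project_dir)))


# Example usage (hypothetical paths):
# install_into("my_project")
# zip_project("my_project", "build/my_project")  # writes build/my_project.zip
```

Archiving from root_dir keeps script.py and the mpmath folder at the top level of the archive, which is the layout the extraction step later relies on.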
Upload to Amazon S3
After creating the .zip file, upload it to an Amazon S3 bucket.
After the .zip file has been uploaded to Amazon S3, it's accessible from SageMaker Canvas.
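The upload can be done from the console or programmatically. As one possible sketch, the helper below wraps the boto3 S3 client's upload_file call; the helper name upload_archive, the bucket name, and the object key are placeholders, and the client is passed in so that the function itself has no hard dependency on AWS credentials.

```python
def upload_archive(s3_client, zip_path, bucket_name, object_key):
    """Upload a local .zip file to S3 and return its s3:// URI."""
    # upload_file(Filename, Bucket, Key) is a boto3 S3 client method.
    s3_client.upload_file(str(zip_path), bucket_name, object_key)
    return f"s3://{bucket_name}/{object_key}"


# Example usage (requires boto3 and AWS credentials; names are placeholders):
# import boto3
# upload_archive(boto3.client("s3"), "my_project.zip",
#                "your-bucket-name", "my_project.zip")
```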
Run the custom script
Return to the data flow in SageMaker Canvas, replace the previous custom function code with the following code, and select Update.
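The code listing is not reproduced in this text. A minimal sketch of the approach, under stated assumptions, might look like the following: download the archive, extract it, put the extracted folder on sys.path, and import script.py by file path. The helper names download_archive and load_function_from_zip, the module name custom_script, the bucket, the key, and the /tmp paths are all hypothetical.

```python
import importlib.util
import sys
import zipfile
from pathlib import Path


def download_archive(s3_client, bucket_name, zip_key, local_zip):
    """Fetch the .zip from S3 via the boto3 client's download_file(Bucket, Key, Filename)."""
    s3_client.download_file(bucket_name, zip_key, str(local_zip))
    return local_zip


def load_function_from_zip(zip_path, extract_to, function_name,
                           module_file="script.py"):
    """Extract the archive, expose its contents on sys.path, return one function."""
    extract_dir = Path(extract_to)
    extract_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(extract_dir)

    # Make bundled packages such as mpmath importable from the extracted folder.
    if str(extract_dir) not in sys.path:
        sys.path.insert(0, str(extract_dir))

    # Import script.py by file path, then look up the requested function.
    spec = importlib.util.spec_from_file_location(
        "custom_script", extract_dir / module_file)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return getattr(module, function_name)


# Possible wiring inside the custom transform (all names are placeholders):
# import boto3
# function_name = "calculate_total_distance"  # or "udf_total_distance" for PySpark
# download_archive(boto3.client("s3"), "your-bucket-name",
#                  "my_project.zip", "/tmp/my_project.zip")
# transform = load_function_from_zip("/tmp/my_project.zip", "/tmp/my_project",
#                                    function_name)
# df = transform(df)
```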
This example extracts the .zip file and adds the required dependencies to the local path so that they're available to the function at run time. Because mpmath has been added to the local path, you can now call a function that relies on this external library.
The preceding code runs using the Python (Pandas) runtime and the calculate_total_distance function. To use the Python (PySpark) runtime, update the function_name variable to call the udf_total_distance function instead.
Complete the data flow
As a final step, remove irrelevant columns before training the model. Follow these steps:
- On the SageMaker Canvas console, select + Add transform. From the drop-down menu, select Manage columns.
- Under Transform, choose Drop column. Under Columns to drop, add ProductId_0, ProductId_1, and OrderID, as shown in the following screenshot.
The final dataset should contain 13 columns. The complete data flow is pictured in the following image.
Train the model
To train the model, follow these steps:
- At the top right of the page, select Create model and name your dataset and model.
- Select Predictive analysis as the problem type and OnTimeDelivery as the target column, as shown in the following screenshot.
When building the model, you can choose between a Quick build and a Standard build. A Quick build prioritizes speed over accuracy and produces a trained model in less than 20 minutes. A Standard build prioritizes accuracy over latency, but the model takes longer to train.
Results
After the model build is complete, you can view the model's accuracy along with metrics such as F1, precision, and recall. In the case of a Standard build, the model achieved 94.5% accuracy.
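For reference, these metrics relate to the counts of a confusion matrix as sketched below, assuming a binary target (on time versus late). The counts are made-up illustrative numbers, not results from this model, and the function name classification_metrics is hypothetical.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return accuracy, precision, recall, and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical counts for an on-time / late classifier.
metrics = classification_metrics(tp=80, fp=10, fn=20, tn=90)
```

Accuracy counts all correct predictions, whereas precision and recall focus on the positive class, which is why they can diverge when the classes are imbalanced.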
After model training is complete, there are four ways you can use your model:
- Deploy the model directly from SageMaker Canvas to an endpoint
- Add the model to the SageMaker Model Registry
- Export your model to a Jupyter notebook
- Send your model to Amazon QuickSight for use in dashboard visualizations
Clean up
To manage costs and prevent additional workspace charges, choose Log out to sign out of SageMaker Canvas when you're done using the application, as shown in the following screenshot. You can also configure SageMaker Canvas to shut down automatically when idle.
If you created an S3 bucket specifically for this example, you might also want to empty and delete the bucket.
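If you prefer to script the bucket cleanup, one possible sketch follows. It assumes a boto3 S3 Bucket resource for an unversioned bucket; the helper name empty_and_delete_bucket and the bucket name are placeholders. The operation is irreversible, so double-check the bucket name before running anything like this.

```python
def empty_and_delete_bucket(bucket):
    """Delete every object in the bucket, then the bucket itself."""
    # `bucket` is assumed to be a boto3 S3 Bucket resource (unversioned bucket).
    bucket.objects.all().delete()
    bucket.delete()


# Example usage (irreversible; the bucket name is a placeholder):
# import boto3
# empty_and_delete_bucket(boto3.resource("s3").Bucket("your-bucket-name"))
```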
Summary
In this post, we showed how you can upload custom dependencies to Amazon S3 and integrate them into SageMaker Canvas workflows. By walking through a practical example that implements a custom distance calculation function with the mpmath library, we showed how to:
- Package custom code and dependencies into a .zip file
- Store and access these dependencies from Amazon S3
- Implement custom data transformations in SageMaker Data Wrangler
- Train a predictive model using the transformed data
This approach means that data scientists and analysts can extend SageMaker Canvas capabilities beyond the more than 300 included functions.
To try custom transforms yourself, refer to the Amazon SageMaker Canvas documentation and sign in to SageMaker Canvas today. For additional insights into how you can optimize your SageMaker Canvas implementation, we recommend exploring these related posts:
About the author
Nadhya Polanco is an Associate Solutions Architect at AWS based in Brussels, Belgium. In this role, she supports organizations looking to incorporate AI and machine learning into their workloads. In her free time, Nadhya enjoys indulging in her passion for coffee and exploring new places.