Introducing the Agent Toolkit for Amazon Web Services

What is the Agent Toolkit for AWS?

It is an open-source project built by AWS that helps AI coding agents work with AWS more reliably. With the recent addition of an MCP server, released as part of the Toolkit, coding agents that use the Toolkit can now access AWS-specific context, workflows, guardrails, and the tools they need to build, deploy, debug, and operate cloud systems without relying solely on general model knowledge, which is often out of date.

Instead of asking a coding agent to improvise from memory, the Toolkit gives it curated, task-specific instructions. These are packaged as skills, plugins, rules, and MCP server configuration.

Skills are focused instruction packages. They guide the agent through specific AWS tasks, such as creating an S3 Tables lakehouse table, deploying a serverless application, fixing a Lambda timeout error, connecting AWS Glue to a database, or adding memory to an AgentCore agent.

Plugins group related skills together. For example, aws-core covers general AWS development, aws-agents covers Bedrock AgentCore workflows, and aws-data-analytics covers S3 Tables, Glue, Athena, data discovery, and vector storage.

Rules files set the agent's default AWS behavior. They can tell the agent to prefer infrastructure as code, check the AWS documentation when unsure, and use the AWS MCP tools when they are available.

The AWS MCP Server integration gives agents access to live AWS documentation, AWS APIs, sandboxed script execution, and auditing through native AWS controls.

The result should be better, more robust systems.

Why does it matter?

Modern coding agents can write plausible AWS CLI commands, Terraform, CDK, Lambda handlers, Glue jobs, or IAM policies. Usually these will be correct and usable right away, but there is a potential problem, and it is the same one that ALL coding agents contend with: knowledge cutoff.

When agents are trained, they are exposed to the latest information available at that time, but by the time the models are released, that knowledge is often several months out of date. For example, the latest OpenAI model at the time of writing is GPT 5.5. It was released in late April 2026, but its knowledge cutoff date was December 1, 2025. During the intervening period, new services are launched, and existing systems, API calls, documentation, and so on are updated.

Cloud development is full of details that may look minor but can break real systems. For example, when creating an analytics table with Amazon S3 Tables, a generic agent might generate an Athena DDL statement with a LOCATION clause because that pattern is common for external tables. But for S3 Tables that is wrong: the service manages the table storage. The correct pattern is to keep the SQL clean and pass the S3 Tables catalog through Athena's query execution context.
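
As a rough illustration of that pattern, the sketch below builds the arguments for Athena's StartQueryExecution call with no LOCATION clause in the SQL and with the catalog carried in the query execution context. The column list and the results bucket are assumptions for illustration, and the exact DDL the Toolkit's skill produces may differ. The catalog and namespace names are the ones that appear later in this article.

```python
from typing import Any


def build_s3_tables_ddl_request(
    catalog: str, namespace: str, table: str, output_location: str
) -> dict[str, Any]:
    """Build keyword arguments for Athena's StartQueryExecution call."""
    # No LOCATION clause here: S3 Tables manages the table storage itself.
    ddl = (
        f"CREATE TABLE {table} (\n"
        "  order_id string,\n"
        "  order_date date,\n"
        "  amount double\n"
        ")\n"
        "TBLPROPERTIES ('table_type' = 'iceberg')"
    )
    return {
        "QueryString": ddl,
        # The catalog and namespace travel in the execution context, not in the SQL.
        "QueryExecutionContext": {"Catalog": catalog, "Database": namespace},
        "ResultConfiguration": {"OutputLocation": output_location},
    }


request = build_s3_tables_ddl_request(
    catalog="s3tablescatalog/mytablebucket",
    namespace="retail",
    table="orders",
    output_location="s3://example-athena-results/",  # hypothetical bucket
)
print(request["QueryString"])
```

With boto3 installed and credentials configured, a request like this could be passed to the Athena client as start_query_execution(**request).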

The Agent Toolkit for AWS helps avoid that kind of mistake. Its skills direct the agent to:

  • Check what already exists before creating new resources
  • Use the correct AWS APIs
  • Avoid patterns that AWS does not support
  • Verify assumptions against the current AWS documentation
  • Generate tightly scoped IAM policies
  • Run checks after making changes
  • Follow a proper troubleshooting path when something breaks

That matters most in AWS work, where the hard part is not writing code. It is writing code that fits the specific AWS service, permissions model, and runtime environment.

Installing the Agent Toolkit for your coding agent

The Agent Toolkit is available for many modern agents, such as Claude Code, Cursor, Kiro, and VS Code. Sample installation instructions for each agent are in the AWS toolkit repo, which I'll link to at the end of the article.

My preferred coding agent at the moment is Codex, so that's what I'll use in my example. Install that first if you want to follow along.

To install the Toolkit for Codex, type the following into a terminal window:

$ codex plugin marketplace add aws/agent-toolkit-for-aws

Next, open the Codex app and type the following.

/plugins

Depending on what you have installed previously, you should see something like this.

AI Agents on AWS
AWS Core
AWS Data Analytics
Browser
Documents
Presentations
Spreadsheets

That means all the required AWS-related plugins are available for your agent to use. Remember, if you run into any problems, just ask your coding agent to fix them.

Using the Toolkit with your coding agent

This is the easy part, as you simply tell Codex in plain English what you want to achieve.

For my example, I wanted to:

  • Create an Iceberg orders table using Amazon S3 Tables
  • Ingest order data from a JDBC source with AWS Glue
  • Validate and query the Iceberg table with Athena

This may seem like a simple request, but when you break it down, there is much more involved than you might think. First, I had no existing JDBC data source, so I had to ask Codex to create an RDS database first and populate it with some dummy data. That alone creates a host of other requirements: because my RDS database is private, I needed a VPC, security groups, IAM permissions, and so on.

You get the point, and this will be a very familiar problem to anyone reading this who uses AWS in anger. Even a simple AWS system usually needs a complex setup, because you have to account for security, authorization, and permissions.

But as you'll see, the AWS Toolkit does all the heavy lifting for you.

Note: The Agent Toolkit for AWS runs inside your coding agent's environment. When it needs to inspect, create, or modify AWS resources, it uses the AWS credentials configured in that environment. For local development, that usually means AWS CLI credentials, SSO, or environment variables, so make sure one of these methods is set up before you start.

To get started, I fired up my Codex app and typed the following:

Create an Iceberg orders table using Amazon S3 Tables, ingest order data
from a JDBC source with AWS Glue, and validate the table with Athena.
The first thing to do is set up a private JDBC source for the above with all 
required VPC, networking and security infrastructure. Populate the JDBC source 
with sample data,then everything else

After that, the agent just… got on with it. Here is some of the output it produced for me. I won't include all of it, as it was a very long list.

Worked for 32m 49s

I'll use the AWS Data Analytics skills for the JDBC source, S3 Tables/Iceberg 
table creation, ingestion, and Athena validation. I'm first checking the 
workspace shape so the implementation matches what's already here.

The existing workspace already has an aws-data-lake directory, so I'm 
inspecting it before deciding whether to extend it or create a separate 
deployment package. I also need to verify local AWS tooling and credentials 
before any real AWS resources can be created.

...
...
...

Implemented files:

orders-private-jdbc-source.yaml: VPC, private Aurora PostgreSQL source, 
NAT/S3 endpoint, security groups, Glue connection/job, IAM roles, seed data 
custom resource.

glue_orders_jdbc_to_s3tables.py: Glue 5.1 JDBC-to-S3-Tables ingestion script.
deploy-orders-end-to-end.ps1: deploys stack, uploads Glue script, grants 
Lake Formation permissions, runs Glue, validates Athena.

README.md: updated runbook and validation results.

Validation completed:

CloudFormation stack: orders-private-jdbc-source
Glue job: orders-lake-jdbc-to-s3tables-orders
Successful Glue run: jr_a5c6e33544a452a107429f0f7613d5522d8f145062fbb40ad77507529a26263c
Target table: s3tablescatalog/mytablebucket.retail.orders
Athena row count: 6
Athena date range: 2026-04-01 to 2026-05-03
Athena sample rows returned ord-1001 through ord-1005
...
...

The whole process ran end to end and took just over 30 minutes from start to finish.

But did it work?

Yes, and here's the proof.

CloudFormation

This is the service AWS uses to automate the creation of all the resources needed to build a given system. It is the single source of truth for what was actually created. We can use the AWS CLI to check what CloudFormation did.

aws cloudformation describe-stacks --stack-name orders-private-jdbc-source --region us-east-2 --query "Stacks[0].StackStatus" --output text

# Output
UPDATE_COMPLETE
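
If you want to act on that status in a script rather than read it by eye, a small helper can map it to a coarse outcome. This is a minimal sketch: the three-way grouping is my own simplification of CloudFormation's status names, not something the Toolkit generates.

```python
def classify_stack_status(status: str) -> str:
    """Map a CloudFormation stack status string to a coarse outcome."""
    # Any rollback (finished or still running) means the last change did not stick.
    if "ROLLBACK" in status or status.endswith("_FAILED"):
        return "failed"
    if status.endswith("_IN_PROGRESS"):
        return "in progress"
    # Note: DELETE_COMPLETE also lands here, meaning the stack is gone.
    if status.endswith("_COMPLETE"):
        return "ok"
    return "unknown"


for status in ("UPDATE_COMPLETE", "CREATE_IN_PROGRESS", "UPDATE_ROLLBACK_COMPLETE"):
    print(status, "->", classify_stack_status(status))
```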

We can also get a full list of all the resources and services that CloudFormation created on our behalf. I've shown the command for this below, but please note that I have pre-processed the output to make it more readable.

aws cloudformation list-stack-resources --stack-name orders-private-jdbc-source --region us-east-2 --output table

# Modified Output

+------------------------------------------------------+-------------------------------+-----------------+
| Service Deployed                                     | ResourceType                   | ResourceStatus  |
+------------------------------------------------------+-------------------------------+-----------------+
| S3 bucket (artifact/scripts bucket)                  | AWS::S3::Bucket                | CREATE_COMPLETE |
| Security group rule (ingress)                        | AWS::EC2::SecurityGroupIngress | CREATE_COMPLETE |
| Secrets Manager secret (DB credentials)              | AWS::SecretsManager::Secret    | CREATE_COMPLETE |
| Security group (database SG)                         | AWS::EC2::SecurityGroup        | CREATE_COMPLETE |
| RDS DB subnet group (Aurora subnets)                 | AWS::RDS::DBSubnetGroup        | CREATE_COMPLETE |
| IAM role (Glue job execution role)                   | AWS::IAM::Role                 | UPDATE_COMPLETE |
| Security group (Glue/Spark SG)                       | AWS::EC2::SecurityGroup        | CREATE_COMPLETE |
| Security group rule (egress)                         | AWS::EC2::SecurityGroupEgress  | CREATE_COMPLETE |
| Security group rule (ingress)                        | AWS::EC2::SecurityGroupIngress | CREATE_COMPLETE |
| Security group rule (egress)                         | AWS::EC2::SecurityGroupEgress  | CREATE_COMPLETE |
| Internet Gateway (VPC IGW)                           | AWS::EC2::InternetGateway      | CREATE_COMPLETE |
| IAM role (Lake Formation / S3 Tables access role)    | AWS::IAM::Role                 | CREATE_COMPLETE |
| Elastic IP (for NAT Gateway)                         | AWS::EC2::EIP                  | CREATE_COMPLETE |
| NAT Gateway                                          | AWS::EC2::NatGateway           | CREATE_COMPLETE |
| Aurora DB cluster (PostgreSQL)                       | AWS::RDS::DBCluster            | CREATE_COMPLETE |
| Aurora DB instance (writer/instance)                 | AWS::RDS::DBInstance           | CREATE_COMPLETE |
| Glue job (JDBC -> S3 Tables ingestion)               | AWS::Glue::Job                 | CREATE_COMPLETE |
| Glue JDBC connection (to Aurora/Postgres)            | AWS::Glue::Connection          | CREATE_COMPLETE |
| Route (private default route, typically to NAT)      | AWS::EC2::Route                | CREATE_COMPLETE |
| Route table (private)                                | AWS::EC2::RouteTable           | CREATE_COMPLETE |
| Subnet (private subnet 1)                            | AWS::EC2::Subnet               | CREATE_COMPLETE |
| Route table association (private subnet 1)           | AWS::EC2::SubnetRouteTableAssoc| CREATE_COMPLETE |
| Subnet (private subnet 2)                            | AWS::EC2::Subnet               | CREATE_COMPLETE |
| Route table association (private subnet 2)           | AWS::EC2::SubnetRouteTableAssoc| CREATE_COMPLETE |
| Route (public default route, typically to IGW)       | AWS::EC2::Route                | CREATE_COMPLETE |
| Route table (public)                                 | AWS::EC2::RouteTable           | CREATE_COMPLETE |
| Subnet (public subnet 1)                             | AWS::EC2::Subnet               | CREATE_COMPLETE |
| Route table association (public subnet 1)            | AWS::EC2::SubnetRouteTableAssoc| CREATE_COMPLETE |
| VPC endpoint (S3 Gateway Endpoint)                   | AWS::EC2::VPCEndpoint          | CREATE_COMPLETE |
| Custom resource (seed orders data step)              | Custom::SeedOrdersData         | CREATE_COMPLETE |
| Lambda function (seeds sample orders into DB)        | AWS::Lambda::Function          | CREATE_COMPLETE |
| IAM role (Lambda execution role for seeding)         | AWS::IAM::Role                 | CREATE_COMPLETE |
| VPC                                                  | AWS::EC2::VPC                  | CREATE_COMPLETE |
| VPC gateway attachment (attach IGW to VPC)           | AWS::EC2::VPCGatewayAttachment | CREATE_COMPLETE |
+------------------------------------------------------+-------------------------------+-----------------+
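
A list this long is easier to sanity-check when it is summarized. The sketch below assumes you ran the same command with --output json instead of --output table, and counts resources by type. The sample payload is a hand-written fragment that reuses a few logical IDs from this stack, not the full response.

```python
import json
from collections import Counter


def count_resource_types(list_stack_resources_json: str) -> Counter:
    """Count stack resources by ResourceType from list-stack-resources JSON output."""
    payload = json.loads(list_stack_resources_json)
    return Counter(s["ResourceType"] for s in payload["StackResourceSummaries"])


# Hand-written fragment for illustration; a real response has many more entries.
sample = json.dumps(
    {
        "StackResourceSummaries": [
            {"LogicalResourceId": "GlueJobRole", "ResourceType": "AWS::IAM::Role",
             "ResourceStatus": "UPDATE_COMPLETE"},
            {"LogicalResourceId": "LakeFormationS3TablesRole", "ResourceType": "AWS::IAM::Role",
             "ResourceStatus": "CREATE_COMPLETE"},
            {"LogicalResourceId": "SeedOrdersFunctionRole", "ResourceType": "AWS::IAM::Role",
             "ResourceStatus": "CREATE_COMPLETE"},
            {"LogicalResourceId": "SeedOrdersFunction", "ResourceType": "AWS::Lambda::Function",
             "ResourceStatus": "CREATE_COMPLETE"},
        ]
    }
)

counts = count_resource_types(sample)
for resource_type, n in counts.most_common():
    print(f"{n:>3}  {resource_type}")
```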

I won't go through ALL the services that were created, but here is a list of the most important ones, with verification.

VPC and networking

A VPC is like having your own small network inside the AWS ecosystem. Associated with it are things like CIDR addresses, route tables, subnets, and security groups, which control which resources have access to the VPC. Let's see what was created.

aws ec2 describe-vpcs --region us-east-2 --query "Vpcs[?Tags[?Key=='aws:cloudformation:stack-name' && Value=='orders-private-jdbc-source']].[VpcId,CidrBlock]" --output table

-------------------------------------------
|              DescribeVpcs               |
+------------------------+----------------+
|  vpc-0165f765ce1af50c0 |  10.40.0.0/16  |
+------------------------+----------------+


aws ec2 describe-subnets --region us-east-2 --query "Subnets[?Tags[?Key=='aws:cloudformation:stack-name' && Value=='orders-private-jdbc-source']].[SubnetId,VpcId,CidrBlock,AvailabilityZone,MapPublicIpOnLaunch]" --output table

-----------------------------------------------------------------------------------------------
|                                       DescribeSubnets                                       |
+---------------------------+-------------------------+----------------+-------------+--------+
|  subnet-0a9e1bbeeb1e7f53d |  vpc-0165f765ce1af50c0  |  10.40.11.0/24 |  us-east-2b |  False |
|  subnet-07dc3d0e99f09cdc4 |  vpc-0165f765ce1af50c0  |  10.40.0.0/24  |  us-east-2a |  True  |
|  subnet-0c640ae5d30fe00e9 |  vpc-0165f765ce1af50c0  |  10.40.10.0/24 |  us-east-2a |  False |
+---------------------------+-------------------------+----------------+-------------+--------+
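A quick way to double-check a layout like this is to confirm that every subnet sits inside the VPC's CIDR block and that no two subnets overlap. The sketch below does that with Python's standard ipaddress module, using the VPC and subnet values from the output above. The check itself is my own addition, not something the agent ran.

```python
import ipaddress

vpc = ipaddress.ip_network("10.40.0.0/16")
subnets = {
    "subnet-0a9e1bbeeb1e7f53d": ipaddress.ip_network("10.40.11.0/24"),
    "subnet-07dc3d0e99f09cdc4": ipaddress.ip_network("10.40.0.0/24"),
    "subnet-0c640ae5d30fe00e9": ipaddress.ip_network("10.40.10.0/24"),
}


def check_subnets(vpc_net, subnet_map) -> list[str]:
    """Return a list of human-readable problems; empty means the layout is consistent."""
    problems = []
    items = list(subnet_map.items())
    for name, net in items:
        if not net.subnet_of(vpc_net):
            problems.append(f"{name} ({net}) is outside {vpc_net}")
    # Compare each pair once to detect overlapping ranges.
    for i, (a_name, a_net) in enumerate(items):
        for b_name, b_net in items[i + 1:]:
            if a_net.overlaps(b_net):
                problems.append(f"{a_name} overlaps {b_name}")
    return problems


print(check_subnets(vpc, subnets) or "subnet layout looks consistent")
```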


aws ec2 describe-security-groups --region us-east-2 --query "SecurityGroups[?Tags[?Key=='aws:cloudformation:stack-name' && Value=='orders-private-jdbc-source']].[GroupId,GroupName,VpcId]" --output table

--------------------------------------------------------------------------------------------------------------------
|                                              DescribeSecurityGroups                                              |
+----------------------+-----------------------------------------------------------------+-------------------------+
|  sg-0c56c3639a47dcbdb|  orders-private-jdbc-source-DatabaseSecurityGroup-ZS9C0AJXzASB  |  vpc-0165f765ce1af50c0  |
|  sg-0f1c55c20ebbf7acf|  orders-private-jdbc-source-GlueSecurityGroup-XvKHWvTsRuap      |  vpc-0165f765ce1af50c0  |
+----------------------+-----------------------------------------------------------------+-------------------------+

IAM roles

Identity and Access Management (IAM) is a key part of AWS security. It controls who and what has access to which resources in AWS.

aws cloudformation list-stack-resources --stack-name orders-private-jdbc-source --region us-east-2 --query "StackResourceSummaries[?ResourceType=='AWS::IAM::Role' || ResourceType=='AWS::IAM::Policy'].[LogicalResourceId,ResourceType,PhysicalResourceId,ResourceStatus]" --output table

# Output
----------------------------------------------------------------------------------------------------------------------------------------
|                                                          ListStackResources                                                          |
+---------------------------+-----------------+--------------------------------------------------------------------+-------------------+
|  GlueJobRole              |  AWS::IAM::Role |  orders-private-jdbc-source-GlueJobRole-7cOVpk9zf1nf               |  UPDATE_COMPLETE  |
|  LakeFormationS3TablesRole|  AWS::IAM::Role |  orders-private-jdbc-sourc-LakeFormationS3TablesRole-4NKeHFJ0VwBh  |  CREATE_COMPLETE  |
|  SeedOrdersFunctionRole   |  AWS::IAM::Role |  orders-private-jdbc-source-SeedOrdersFunctionRole-LPniYBvOU4jt    |  CREATE_COMPLETE  |
+---------------------------+-----------------+--------------------------------------------------------------------+-------------------+

We can see that the appropriate roles were created so that we can create the S3 table, populate our RDS database with data using a Lambda function, and populate our S3 table from the RDS database using a Glue job.

RDS Database

This was created as the original data source for our Iceberg table in S3. After creation, the database table was populated with dummy data using a Lambda function.

Lambda function

This was used to "seed" the RDS database with dummy data for onward propagation to the S3 table. I won't show the code, but the function itself was around 70 lines of Python.

aws cloudformation list-stack-resources --stack-name orders-private-jdbc-source --region us-east-2 --query "StackResourceSummaries[?ResourceType=='AWS::Lambda::Function'].[LogicalResourceId,PhysicalResourceId,ResourceStatus]"   --output table

# Output
---------------------------------------------------------------------------------------------------------
|                                          ListStackResources                                           |
+--------------------+--------------------------------------------------------------+-------------------+
|  SeedOrdersFunction|  orders-private-jdbc-source-SeedOrdersFunction-aDuplePymvWQ  |  CREATE_COMPLETE  |
+--------------------+--------------------------------------------------------------+-------------------+

Iceberg table in S3

Our main goal was to produce an Iceberg table in S3 that we can use to store, manipulate, and analyze data. The screenshot below shows the table data, which exactly matches the source data in the RDS database.

Athena

There is little point in having an S3 table with data unless we can query, update, or delete that data. That is exactly what AWS's Athena service lets you do. Unfortunately, I ran into a problem when I first tried to use Athena. When I checked the Athena console, I couldn't see the new S3 table in the list of databases and tables, so I couldn't query it. However, after some back and forth with Codex, it eventually found this:

The key mismatch is now clear: database/table permissions are present, but 
catalog-level Lake Formation permissions are empty. AWS's Athena S3 Tables 
docs say the Athena principal also needs permissions on the S3 Tables catalog 
itself. I'm going to grant ALL on xxxxxxxxxxxx:s3tablescatalog/mytablebucket 
to the root user, then verify it.

After that, access was fine.

aws athena list-data-catalogs --region us-east-2 --output table

# Output
---------------------------------------------------------
|                   ListDataCatalogs                    |
+-------------------------------------------------------+
||                 DataCatalogsSummary                 ||
|+-------------------------+-------------------+-------+|
||       CatalogName       |      Status       | Type  ||
|+-------------------------+-------------------+-------+|
||  AwsDataCatalog         |  CREATE_COMPLETE  |  GLUE ||
||  mytablebucket_s3tables |  CREATE_COMPLETE  |  GLUE ||
|+-------------------------+-------------------+-------+|

And I was able to query the table in the Athena console.

Glue & Spark

Glue is AWS's ETL tool and serves two important purposes. It can catalog data sources, making them available to other AWS services such as Athena. Glue can also use Spark or Pandas to read data sources (such as an RDS database) and use whatever data it retrieves to create and populate data stores in other services, such as S3 tables and objects.

aws glue get-connection --name orders-lake-orders-aurora-postgres --region us-east-2 --output json

# Output
{
    "Connection": {
        "Name": "orders-lake-orders-aurora-postgres",
        "Description": "Private Aurora PostgreSQL orders source for S3 Tables ingestion.",
        "ConnectionType": "JDBC",
        "ConnectionProperties": {
            "JDBC_ENFORCE_SSL": "false",
            "JDBC_CONNECTION_URL": "jdbc:postgresql://orders-private-jdbc-source-ordersdbcluster-wxxm5ygu3dig.cluster-chfygkamm03d.us-east-2.rds.amazonaws.com:5432/ordersdb",
            "SECRET_ID": "arn:aws:secretsmanager:us-east-2:XXXXXXXXXXXX:secret:DatabaseSecret-AYWd1SzbgdsG-3K7a0X"
        },
        "PhysicalConnectionRequirements": {
            "SubnetId": "subnet-0c640ae5d30fe00e9",
            "SecurityGroupIdList": [
                "sg-0f1c55c20ebbf7acf"
            ],
            "AvailabilityZone": "us-east-2a"
        },
        "CreationTime": "2026-05-08T21:27:26.593000+01:00",
        "LastUpdatedTime": "2026-05-08T21:27:26.593000+01:00",
        "LastUpdatedBy": "user/administrator",
        "ConnectionSchemaVersion": 1
    }
}

Codex also generated Spark code in Glue to load data from the RDS database into Iceberg. I won't show all of the code, as it is about 100 lines, but here is an excerpt.

import sys
from datetime import datetime, timezone

import boto3
from awsglue.context import GlueContext
from awsglue.job import Job
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext
from pyspark.sql.functions import col, lit, to_date

args = getResolvedOptions(
    sys.argv,
    [
        "JOB_NAME",
        "connection_name",
        "source_table",
        "target_table",
        "watermark_bucket",
        "watermark_key",
    ],
)

sc = SparkContext()
...
...
...
row_count = changed_df.count()
print(f"Found {row_count} changed rows")

if row_count > 0:
    orders_df = changed_df.select(
        col("order_id").cast("string").alias("order_id"),
        col("customer_id").cast("string").alias("customer_id"),
        to_date(col("order_date")).alias("order_date"),
        col("status").cast("string").alias("status"),
        col("amount").cast("double").alias("amount"),
        col("updated_at").cast("timestamp").alias("updated_at"),
        lit(datetime.now(timezone.utc)).cast("timestamp").alias("load_timestamp"),
    )

    orders_df.writeTo(args["target_table"]).append()

    new_watermark = changed_df.agg({"updated_at": "max"}).collect()[0][0]
    s3.put_object(
        Bucket=args["watermark_bucket"],
        Key=args["watermark_key"],
        Body=str(new_watermark),
    )
    print(f"Updated watermark to {new_watermark}")
else:
    print("No new rows to ingest")

job.commit()
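
The excerpt elides how changed_df is derived, but the watermark_bucket and watermark_key arguments suggest a standard incremental-load pattern: remember the highest updated_at value already ingested and only pick up rows newer than that. The sketch below shows that idea in plain Python rather than Spark. It is my reading of the pattern, not the code Codex generated, and the sample rows and timestamps are made up for illustration.

```python
from datetime import datetime
from typing import Optional


def select_changed(rows: list[dict], watermark: Optional[datetime]):
    """Return rows updated after the watermark, plus the watermark to store next."""
    changed = [r for r in rows if watermark is None or r["updated_at"] > watermark]
    # If nothing changed, keep the previous watermark so the next run starts from the same point.
    new_watermark = max((r["updated_at"] for r in changed), default=watermark)
    return changed, new_watermark


# Illustrative rows only; the real job reads them from the JDBC source.
rows = [
    {"order_id": "ord-1001", "updated_at": datetime(2026, 4, 1, 9, 0)},
    {"order_id": "ord-1002", "updated_at": datetime(2026, 4, 15, 12, 30)},
    {"order_id": "ord-1003", "updated_at": datetime(2026, 5, 3, 18, 45)},
]

first_batch, wm = select_changed(rows, None)    # first run: no watermark yet
second_batch, wm2 = select_changed(rows, wm)    # second run: nothing new
print(len(first_batch), len(second_batch), wm2)
```

Because the watermark only advances after a successful write, a failed run simply retries the same rows next time. Whether duplicates can appear then depends on how the target table is written.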

Other considerations when using the AWS toolkit

1/ Restricting the agent's access to specific AWS services.

The toolkit's AWS MCP server uses your default IAM permissions to create and access AWS resources. If you want to restrict access to specific AWS services, you have a couple of options.

a) Two global condition context keys are automatically added to all requests made through the AWS MCP server:

  • aws:ViaAWSMCPService: set to true for any request that passes through an AWS-managed MCP server.
  • aws:CalledViaAWSMCP: contains the service principal of the specific AWS-managed MCP server (for example, aws-mcp.amazonaws.com).

You can use these context keys in your IAM policies to allow or deny actions initiated by any AWS-managed MCP server. For example, suppose you wanted to deny the MCP server the ability to delete S3 buckets or objects. You could use this policy statement:

{
    "Effect": "Deny",
    "Action": ["s3:DeleteBucket", "s3:DeleteObject"],
    "Resource": "*",
    "Condition": {
        "StringEquals": {
            "aws:CalledViaAWSMCP": "aws-mcp.amazonaws.com"
        }
    }
}
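
The JSON above is a single statement. To attach it, it has to sit inside a full policy document with a Version and a Statement list. The sketch below wraps it that way and prints the result. The function name and the Sid are placeholders of my own, and the condition key and principal are taken as given from the statement above.

```python
import json


def deny_via_mcp_policy(
    actions: list[str], mcp_principal: str = "aws-mcp.amazonaws.com"
) -> dict:
    """Wrap a Deny statement scoped to MCP-originated calls in a full policy document."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyViaAwsMcp",  # placeholder statement ID
                "Effect": "Deny",
                "Action": actions,
                "Resource": "*",
                "Condition": {
                    "StringEquals": {"aws:CalledViaAWSMCP": mcp_principal}
                },
            }
        ],
    }


policy = deny_via_mcp_policy(["s3:DeleteBucket", "s3:DeleteObject"])
print(json.dumps(policy, indent=4))
```

Because this is an explicit Deny, it overrides any Allow the principal has, but only for requests that carry the MCP condition key. The same actions performed directly by the user are unaffected.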

b) Another option is to create a dedicated role for the AWS toolkit. Attach whatever restrictive policies you want to that role, then create an AWS CLI profile for it using the aws configure command.

Then, before starting your coding agent (e.g., Codex), set the AWS_PROFILE environment variable to your new profile name, for Codex only.
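
One way to scope AWS_PROFILE to the agent only is to set it on the child process's environment instead of exporting it in your shell. The sketch below shows the mechanism in Python. The profile name agent-toolkit is hypothetical, and a child Python process stands in for the coding agent so the example can run anywhere.

```python
import os
import subprocess
import sys


def env_with_profile(profile: str) -> dict[str, str]:
    """Copy the current environment and set AWS_PROFILE only in the copy."""
    env = dict(os.environ)
    env["AWS_PROFILE"] = profile
    return env


before = os.environ.get("AWS_PROFILE")          # whatever the parent shell had (often None)
child_env = env_with_profile("agent-toolkit")   # hypothetical profile name

# Stand-in for launching the agent: a child process that reports the profile it sees.
result = subprocess.run(
    [sys.executable, "-c", "import os; print(os.environ['AWS_PROFILE'])"],
    env=child_env,
    capture_output=True,
    text=True,
    check=True,
)
print("child sees:", result.stdout.strip())
print("parent still has:", os.environ.get("AWS_PROFILE"))
```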

2/ Observability

Monitoring the Agent Toolkit for AWS is done primarily through the AWS MCP server, as it is the managed component that receives tool calls and performs AWS actions. The two main AWS services used for monitoring are therefore the same ones used for other AWS resources: CloudWatch and CloudTrail.

The AWS MCP server automatically publishes metrics to CloudWatch in the AWS-MCP namespace. You can see:

  • Invocation: how many times a tool was called
  • Success: tool calls that succeeded
  • UserError: client-side errors, usually IAM denying an action or bad parameters
  • SystemError: server-side failures
  • Throttle: throttled requests
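
Raw counts are easier to reason about as rates. The sketch below assumes the metric names are Invocation, Success, UserError, SystemError, and Throttle (the exact names in CloudWatch may differ) and that you have already retrieved the totals for a time window. The numbers are made up.

```python
def summarize_tool_metrics(counts: dict[str, int]) -> dict[str, float]:
    """Express each outcome as a fraction of total invocations."""
    total = counts.get("Invocation", 0)
    if total == 0:
        return {}  # nothing was called in this window
    outcomes = ("Success", "UserError", "SystemError", "Throttle")
    return {name: counts.get(name, 0) / total for name in outcomes}


# Hypothetical totals for one day.
window = {"Invocation": 200, "Success": 180, "UserError": 15, "SystemError": 3, "Throttle": 2}
for name, rate in summarize_tool_metrics(window).items():
    print(f"{name:<12}{rate:.1%}")
```

A rising UserError rate is usually the first place to look after tightening IAM policies, since denied actions show up there.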

CloudTrail records the actual AWS API calls made in your account. This is where you can check:

  • Who made the call
  • Which API was called
  • When it happened
  • The source IP
  • The assumed role or IAM principal
  • Whether the action succeeded or failed
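
Each of those questions maps onto a field of a CloudTrail event record. The sketch below pulls them out of a single record. The sample record is hand-written with a placeholder account ID and role name, and treating the absence of errorCode as success is a simplification.

```python
def summarize_event(record: dict) -> dict:
    """Extract the who/what/when/where/outcome fields from one CloudTrail record."""
    identity = record.get("userIdentity", {})
    return {
        "who": identity.get("arn", "unknown"),
        "api": f'{record.get("eventSource")}:{record.get("eventName")}',
        "when": record.get("eventTime"),
        "source_ip": record.get("sourceIPAddress"),
        # CloudTrail adds errorCode only when the call failed.
        "succeeded": "errorCode" not in record,
    }


# Hand-written sample; the account ID and role name are placeholders.
sample_record = {
    "eventTime": "2026-05-08T20:27:26Z",
    "eventSource": "s3.amazonaws.com",
    "eventName": "DeleteBucket",
    "sourceIPAddress": "203.0.113.10",
    "userIdentity": {
        "type": "AssumedRole",
        "arn": "arn:aws:sts::123456789012:assumed-role/agent-toolkit-role/session",
    },
    "errorCode": "AccessDenied",
}

print(summarize_event(sample_record))
```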

Conclusion

If you're a data engineer, data architect, or DevOps specialist, using the AWS Toolkit is a real boon. With the plugins and tools it provides, you can access all AWS services and more than 15,000 API calls.

In short, when used with a coding agent, the AWS toolkit can:

  • Create AWS resources, write code, and deploy applications. The toolkit helps the agent choose the right services and follow AWS best practices.
  • Give the agent access to the latest AWS documentation, APIs, and service details.
  • Keep the agent on tested, documented AWS workflows rather than guesswork for complex tasks such as IAM policies, data pipelines, or serverless applications.
  • Help the agent investigate failed deployments, errors, or cost spikes using AWS logs, metrics, stack status, and troubleshooting guidance.
  • Let you monitor agent activity, control access with IAM, and set guardrails such as read-only access or blocking specific AWS actions.
  • Work with many MCP-compatible coding agents, including Claude Code, Cursor, Codex, Kiro, Windsurf, and others.

However, as the issue I ran into with my Athena setup shows, the Toolkit, while an excellent time saver, is not infallible. As with all agent output, check the work before putting anything into production.

For more information about the Agent Toolkit for AWS, check out the official GitHub repo.
