HiStack.net - AI & System Design Newsletter

Working at Microsoft as a Cloud Solution Architect (CSA)

Maxime Marlot — Sun, 26 Jul 2026 07:00:57 GMT

What Does a Cloud Solution Architect (CSA) Do at Microsoft?

When I saw the job offer to join Microsoft as Cloud Solution Architect, I figured I already had a rough idea of the role. In fact, I have been a solution architect for a few years before applying, so I assumed this CSA role would land close to what I knew. It did land close but with some differences that creates some confusion for the new applications.

Working as a Cloud Solution Architect at Microsoft, Google or AWS is very unique, and quite different from a traditional architect position. This post reflects my experience after few months doing that!

You want to work in the big tech industry and have some questions, just drop me a message here :)

What Is a Cloud Solution Architect at Microsoft?

As a CSA, I sit inside Microsoft’s Customer Success Unit. In short, we come in after a customer has already decided to invest in Azure, and we help them to get the best of the Microsoft solutions so they can reach their objectives. You are not the person managing the commercial relationship, although, as a customer facing role, you are accountable for both: the technical outcome and the reputation of the company.

That distinction matters. A Cloud Solution Architect is measured on whether the customer’s solution works as expected, scales, and gets adopted.

On a day to day basis, this can take many forms: it could be by advising the customers on the best way to extend their solution, review an existing architecture or clarify any points of our data & AI solutions.

Does a CSA Actually Do Technical Work?

If the customer is coming to you, it probably means they’ve already been through the documentation and the experimentation phase.

So, yes, it does. Just not in the way you pictured at first.

Many people assumed that Cloud Solution Architect will be deep inside each customer’s environment, writing the code and running the implementation himself. When you’re responsible for a whole portfolio of customers, that doesn’t scale. The hands-on delivery usually sits with the customer’s own engineers, or with a partner who specializes in exactly that kind of build.

So where does the technical part come in? A customer might come-up with a tough question because he could not find any information online, or want advice on how to optimize a workload, or ask me to review an architecture before it ships. Answering any of those well means understanding the technology at a deep level, well beyond its headline features.

You also need broad knowledge that cuts across services, because real problems rarely stay inside one of them. If I can’t reason across networking, identity, cost, and how a workload behaves under load, I’m not much use in the room. That’s the real bar for the job.

CSA vs Cloud Architect: Clearing Up the Confusion

People often mix up the CSA role with a company’s internal cloud architect. They’re related, but the day-to-day is very different.

A company’s cloud architect owns a single estate. They live inside one organization, they know its full history, and they carry its technical debt with them for years.
A Microsoft CSA works across many customers at once. Some of my colleagues see a dozen different environments in a month.

The trade is depth in one place against breadth across many. I chose breadth, and I like the variety, though some weeks I do miss watching a single system mature over time.

What a CSA Actually Does All Day at Microsoft

No two weeks are identical, but most of my time lands in a handful of buckets:

Architecture reviews: We look at what a customer has designed and we advise on some improvements, where it will break and where the costs will surprise them later.
Proofs of concept and MVPs. Sometimes, you can build a small version to settle a question with evidence instead of opinion.
Unblocking: A team gets stuck on an Azure service behaving in a way they didn’t expect. This could be related to optimization issues, security or documentation clarification in some of our data or AI products.
Driving adoption: Part of my role is helping customers genuinely use what they committed to, so their Azure investment becomes working software rather than idle spend.
Voice of the customer. When we see the same painful gap across several customers, we can carry that feedback, develop new materials, inform back the Microsoft product team so it gets better.
Subscribe now

CSA vs Solution Engineer (SE) at Microsoft

This one confuses even people inside the tech industry, because both roles are technical and both are customer-facing. The split comes down to timing.

The Solution Engineer (SE) works before the commitment. They help a customer understand what’s possible, prove the value of a platform, and reach the decision to invest. Their world is pre-sales.

The CSA picks up after that decision. Once the customer has committed, I make sure the thing they bought gets designed well and reaches production in good health. My world is delivery. An SE and a CSA often tag-team the same customer, and the handoff between us is where a lot of the good work happens.

The Skills That Matter Most

A good CSA is a translator as much as an engineer. The people who struggle here are usually brilliant technically but can’t explain a trade-off to someone who isn’t.

The technical bar is real. A few months into the company, I still feel like I’m at the very start of the learning curve. For real.

The scope is broad too, genuinely broad. Even once you’re handed a focus area, say Data & AI, two things hit you fast. That area alone spans a huge number of services, and on top of that you need a real grasp of how it all integrates, which pulls in things like security and compute.

If I can’t hold my own in a room full of the customer’s best engineers, I lose credibility in the first hour.

How to Become a Cloud Solution Architect at Microsoft

There’s no single path, but the people I see land this job tend to follow a similar shape:

Build real depth first. Most CSAs arrive with years of hands-on experience in cloud, infrastructure, data, AI or software. You want scars from having run things in production.
Get certified where it counts. I would not say that this is mandatory, but it will help you to get out of the group. Microsoft really push you to pass certification to make sure you know your business. Certifications won’t get you hired on their own, but they prove you’ve covered the ground.
Learn to explain. Practice turning a messy technical situation into a clear recommendation a business leader can act on. This skill is rarer than deep technical knowledge, and worth more.
Show customer instinct. Microsoft looks for people who genuinely care whether the customer succeeds, not only whether the architecture is elegant.

Few articles that could help you to pass Microsoft certifications: DP-600 and AI-901:

Is This Role Right for You?

If you love staying close to real technology while also working with people, this job is hard to beat. You get the variety of a consultant with the technical depth of an engineer, and you do it with the platform team standing behind you.

The pace is high, and the learning never stops, because Azure shifts under your feet every quarter.

It won’t suit everyone. If you want to own one system and polish it for years, the constant context-switching will wear you down. But if you’re curious, technically strong, and you actually enjoy helping other people succeed, being a Cloud Solution Architect in the big tech industry might be the next place you want to be in the industry.

DP-600 Practice Exam: 40 Free Exam Questions with Answers

Maxime Marlot — Wed, 22 Jul 2026 21:20:56 GMT

I passed the DP-600 in the last few weeks, and this DP-600 practice exam is the study tool I wish I’d had the week before. Every question from this mock comes from Microsoft’s published skills outline and the topics that actually came up on the day, rewritten in my own words.

DP-600: Fabric Analytics Engineer Associate Exam Questions

Last time, I wrote a practice test for the AI-901, readers said the questions felt close to the live exam, I then decided to write this practice test for the DP-600. If you are also looking to pass the AI-901, you can find it here: AI-901 Practice Exam: Free Practice Test.

⚠️ Note: These are original practice questions for the DP-600, based on my own exam experience. Microsoft refreshes this exam periodically, so use this for preparation, not as a guarantee.

The DP-600 sits behind the Microsoft Certified: Fabric Analytics Engineer Associate credential, and you need 700 out of 1000 to pass. Microsoft organizes the skills into three official areas, but data preparation carries almost half the exam, so I’ve split it in two. That leaves four buckets, and they line up with what the exam really drills:

Maintain a data analytics solution: 10 questions
Prepare and serve data: 12 questions
Query and analyze data with T-SQL, KQL, and DAX: 13 questions
Implement and manage semantic models: 5 questions

Quick tip: In my DP-600 exam, I had a lot of T-SQL questions, I highly recommend you to review your T-SQL basics so you can quickly understand the queries and not check the MS Learn Documentation (Yes, you get access to MS Learn documentation during this exam).

How this works? Substack has no quiz tool, so all 40 questions of this DP-600 mock test come first, grouped into four sections. The answer key sits at the bottom of each section.

Maintain a data analytics solution - DP-600 Practice Exam

1. Your analytics team must place all its semantic models and reports under version control that supports branching. The tenant is fully Azure based, and the solution must minimize setup and maintenance. What should you do?

A. Store the semantic models and reports in Azure Data Lake Storage Gen2
B. Connect the workspace to a GitHub repository
C. Connect the workspace to an Azure Repos repository
D. Store the semantic models and reports in OneDrive for Business

2. You manage eight workspaces that all belong to the same business area. You need to group them logically so they can be filtered together in the OneLake data hub. What should you use?

A. A workspace app
B. A domain
C. A deployment pipeline
D. A OneLake shortcut

3. Two security groups need read access to the same Lakehouse. Group 1 reads the data through the SQL analytics endpoint, and Group 2 reads it through Lakehouse Explorer. Following least privilege, which workspace roles should you assign, in that order?

A. Viewer, then Viewer
B. Viewer, then Contributor
C. Contributor, then Viewer
D. Member, then Contributor

4. (Choose THREE.) You have a deployment pipeline with development, test, and production stages, each assigned a workspace. Developers must deploy to development and test but not to production, following least privilege. Which three levels of access should you assign?

A. Build permission on the production semantic models
B. Admin access to the deployment pipeline
C. Viewer access to the development and test workspaces
D. Viewer access to the production workspace
E. Contributor access to the development and test workspaces
F. Contributor access to the production workspace

5. You plan to make bulk edits to a semantic model with the TMDL extension in Visual Studio Code, and you need report and model definitions saved as individual text files in a folder hierarchy for Git. Which file format should you save from Power BI Desktop?

A. .pbix
B. .pbit
C. .pbip
D. .pbids

6. A user named analyst1 must be able to truncate tables in the Sales schema only, and nowhere else. Following least privilege, which T-SQL statement should you run?

A. GRANT CONTROL ON SCHEMA::Sales TO analyst1
B. GRANT ALTER ON SCHEMA::Sales TO analyst1
C. GRANT EXECUTE ON SCHEMA::Sales TO analyst1
D. GRANT SELECT ON SCHEMA::Sales TO analyst1

7. (Choose TWO.) You are preparing a tenant for a proof of concept. Only the project team should be able to trial paid features and create Fabric items, following least privilege. Which two actions should you take in the Fabric admin portal?

A. Enable “Users can try Microsoft Fabric paid features” for the entire organization
B. Enable “Users can try Microsoft Fabric paid features” for specific security groups
C. Enable “Allow guest users to access Microsoft Fabric” for specific security groups
D. Enable “Users can create Fabric items” and exclude specific security groups
E. Enable “Users can create Fabric items” for specific security groups

8. You need to enable read/write access to a semantic model through the XMLA endpoint. Which settings should you modify first?

A. The semantic model settings
B. The workspace settings
C. The capacity settings
D. The tenant settings

9. Several times a day, every query against your warehouse slows down at once. You suspect Fabric is throttling the capacity. What should you use to confirm whether throttling is happening?

A. The capacity settings
B. The Monitoring hub
C. Dynamic management views (DMVs)
D. The Microsoft Fabric Capacity Metrics app

10. (Choose THREE.) Users must be able to create and publish custom Direct Lake semantic models with external tools, following least privilege. Which three actions should you include?

A. In tenant settings, enable “Allow XMLA endpoints and Analyze in Excel with on-premises datasets”
B. In tenant settings, enable “Allow guest users to access Microsoft Fabric”
C. In tenant settings, enable “Users can edit data models in the Power BI service”
D. In capacity settings, set the XMLA endpoint to Read Write
E. In tenant settings, enable “Users can create Fabric items”
F. In tenant settings, enable “Publish to web”

Answer key, Section 1: 1-C, 2-B, 3-B, 4-B and D and E, 5-C, 6-B, 7-B and E, 8-C, 9-D, 10-A and D and E

Prepare and serve data - DP-600 Practice Test

11. Your team needs a new data store for a proof of concept. The data includes semi-structured and unstructured files, and the store must support read access through both T-SQL and Spark. Which type of data store should you recommend?

A. A warehouse
B. A lakehouse
C. An eventhouse
D. An external Hive metastore

12. (Choose TWO.) You plan to query sales files with the SQL analytics endpoint of a Lakehouse. The files sit in an Amazon S3 bucket. Which two actions should you include so the files are queryable through the endpoint?

A. Create the shortcut in the Files section
B. Use the Parquet format
C. Use the CSV format
D. Create the shortcut in the Tables section
E. Use the Delta format

13. You add a Copy data activity to a pipeline to load external data into an existing Lakehouse table. The source schema changes regularly, and each run must replace both the table’s schema and all of its rows. What should you configure on the Copy data activity?

A. On the Source tab, add additional columns
B. On the Destination tab, set the table action to Overwrite
C. On the Settings tab, enable staging
D. On the Source tab, enable partition discovery

14. A subfolder in your Lakehouse holds several CSV files. You need to convert them to the Delta format with V-Order optimization enabled. What should you do from Lakehouse Explorer?

A. Use the Load to Tables feature
B. Create a shortcut in the Files section
C. Create a shortcut in the Tables section
D. Use the Optimize feature

15. Requirements state that data engineers must use low-code tools to ingest customer data into the data store. Which option should you recommend?

A. A stored procedure
B. A pipeline that contains a KQL activity
C. A Spark notebook
D. A dataflow

16. You need to make sure the data-loading activities in a workspace run in a specific sequence, one after another. The solution must minimize effort. What should you do?

A. Create a dataflow that has multiple steps and schedule the dataflow
B. Create and schedule a Spark notebook
C. Create and schedule a Spark job definition
D. Create a pipeline that has dependencies between activities and schedule the pipeline

17. A pipeline has two activities that run in sequence. You need to make sure a failure of the first activity does not stop the second one from running. Which conditional path should you configure between them?

A. Upon Failure
B. Upon Completion
C. Upon Skip
D. Upon Success

18. You are building a pipeline that must run a stored procedure returning the count of active customers, and the returned value must be available to downstream activities. Which type of activity should you add?

A. Switch
B. Copy data
C. Append variable
D. Lookup

19. Two lakehouses live in different workspaces, one with a table named dbo.sales and the other with dbo.customers. You need to reference both tables in the same SQL query without making extra copies of the data. What should you use?

A. A shortcut
B. A dataflow
C. A view
D. A managed table

20. (Choose TWO.) You need to populate a date dimension in the data store. There is no existing source for the dates, and the dimension must contain physically stored rows. Which two approaches meet the goal?

A. Populate the date dimension table by using a dataflow
B. Populate the date dimension table by using a Copy activity in a pipeline
C. Populate the date dimension view by using T-SQL
D. Populate the date dimension table by using a stored procedure activity in a pipeline

21. (Choose THREE.) You are building a customer dimension as a type 2 slowly changing dimension in a warehouse. Which three column types should you add that do not already exist in the source?

A. A foreign key
B. A natural key
C. An effective end date and time
D. A surrogate key
E. An effective start date and time

22. You need to copy a table named schema1.City into schema2, and the solution must minimize the amount of data copied. Which T-SQL statement should you run?

A. CREATE TABLE schema2.City AS SELECT * FROM schema1.City
B. SELECT * INTO schema2.City FROM schema1.City
C. CREATE TABLE schema2.City AS CLONE OF schema1.City
D. INSERT INTO schema2.City SELECT * FROM schema1.City

Answer key, Section 2: 11-B, 12-D and E, 13-B, 14-A, 15-D, 16-D, 17-B, 18-D, 19-A, 20-A and D, 21-C and D and E, 22-C

Query and analyze data - DP-600 Practice assessment

23. You have a warehouse table named staging_sales. You need a T-SQL query that returns 2023 data showing product ID and product name, and only rows whose summarized amount is greater than 10,000. Which approach is correct?

A. Group by product and filter the summarized amount with a HAVING clause
B. Filter the summarized amount with a WHERE clause
C. Filter a non-aggregated column with a HAVING clause
D. Reference the summarized amount by its column alias in the HAVING clause

24. A table named customers in the stage schema holds every update from a CRM, so there can be several rows per customer. You need to return the customer ID, name, postal code, and last updated time of the most recent row for each customer. What should you use?

A. Apply RANK() partitioned by customer ID and keep the rows where it equals 1
B. Group by customer ID and apply MAX() to every other column
C. Apply ROW_NUMBER() partitioned by customer ID and keep the rows where it equals 1
D. Use SELECT DISTINCT on customer ID

25. A row in your warehouse has three price columns: list_price, wholesale_price, and agent_price. You need a column that returns the highest of the three values for each row. Which T-SQL function should you use?

A. MAX
B. COALESCE
C. GREATEST
D. CHOOSE

26. For the same table, you need a column that returns agent_price if it exists, otherwise wholesale_price, and otherwise list_price. Which function fits this fallback logic?

A. COALESCE
B. GREATEST
C. IIF
D. CHOOSE

27. You need to add a column that returns the first day of the month for each order_date value. Which T-SQL function should you use?

A. DATEPART
B. DATEFROMPARTS
C. DATE_BUCKET
D. DATE_TRUNC

28. In a visual query, you merge two tables and need the result to include all rows from both tables. Which join type should you use?

A. Inner
B. Full outer
C. Left outer
D. Right anti

29. You need a DAX query, run through the XMLA endpoint, that returns a table of stores opened since December 1, 2023. How should you build it?

A. Start with SELECT and add a WHERE clause
B. Use CALCULATE without EVALUATE
C. Use DEFINE and EVALUATE, filtering the stores table with FILTER
D. Reference a single measure with no EVALUATE

30. You need a DAX query, run through the XMLA endpoint, that returns the total sales for the same period last year as a single value. How should you complete it?

A. SUMMARIZECOLUMNS with a WHERE clause
B. FILTER over the date table only
C. CALCULATETABLE returning a table
D. CALCULATE with SAMEPERIODLASTYEAR

31. You need to return the 10 highest-value orders, and if several orders tie at the 10th position, all of them must be included. Which clause should you use?

A. SELECT TOP 10 PERCENT ... ORDER BY order_value DESC
B. SELECT ... ORDER BY order_value DESC OFFSET 0 ROWS FETCH FIRST 10 ROWS ONLY
C. SELECT TOP (10) WITH TIES ... ORDER BY order_value DESC
D. SELECT DISTINCT TOP (10) ... ORDER BY order_value DESC

32. You need a 3-day rolling total of daily_sales, meaning each row sums the current day and the two days before it, ordered by sales_date. Which expression is correct?

A. SUM(daily_sales) OVER (ORDER BY sales_date ROWS BETWEEN 2 PRECEDING AND CURRENT ROW)
B. SUM(daily_sales) OVER (ORDER BY sales_date RANGE BETWEEN 2 FOLLOWING AND CURRENT ROW)
C. SUM(daily_sales) OVER (PARTITION BY sales_date)
D. SUM(daily_sales) OVER (ORDER BY sales_date ROWS UNBOUNDED PRECEDING)

33. For each month, you need to show the current month’s revenue next to the previous month’s revenue in the same row. Which window function should you use?

A. LEAD(revenue) OVER (ORDER BY month)
B. FIRST_VALUE(revenue) OVER (ORDER BY month)
C. LAG(revenue) OVER (ORDER BY month)
D. ROW_NUMBER() OVER (ORDER BY month)

34. You need to return rows 21 through 40 of a result set ordered by created_date, for a paged report. Which clause should you use?

A. SELECT TOP (40) ... ORDER BY created_date
B. ORDER BY created_date OFFSET 20 ROWS FETCH NEXT 20 ROWS ONLY
C. ORDER BY created_date OFFSET 21 ROWS FETCH NEXT 40 ROWS ONLY
D. WHERE ROW_NUMBER() BETWEEN 21 AND 40

35. You need to divide customers into four equally sized groups based on total_spend, ranked from highest to lowest, so you can label each quartile. Which function should you use?

A. NTILE(4) OVER (ORDER BY total_spend DESC)
B. RANK() OVER (ORDER BY total_spend DESC)
C. PERCENT_RANK() OVER (ORDER BY total_spend DESC)
D. ROW_NUMBER() OVER (ORDER BY total_spend DESC)

Answer key, Section 3: 23-A, 24-C, 25-C, 26-A, 27-D, 28-B, 29-C, 30-D, 31-C, 32-A, 33-C, 34-B, 35-A

Implement and manage semantic models - DP-600 Practice Exam

36. You create a Direct Lake semantic model over the Delta tables of a warehouse, and that warehouse uses row-level security. When users interact with a report built on the model, which mode do the DAX queries use?

A. Direct Lake
B. Dual
C. Import
D. DirectQuery

37. (Choose TWO.) A DirectQuery semantic model queries a source with 500 million rows, and a report built on it has slow visuals across several pages. Which two features can you use to reduce query execution time?

A. User-defined aggregations
B. Automatic aggregations
C. Query caching
D. OneLake integration

38. You have a custom Direct Lake semantic model with 1 billion rows, and you connect to it with Tabular Editor through the XMLA endpoint. You need to make sure user queries always use Direct Lake mode and never fall back. What should you do?

A. From Model, configure the default mode option
B. From Partitions, configure the mode option
C. From Model, configure the storage location option
D. From Model, configure the Direct Lake behavior option

39. (Choose TWO.) An import-mode model contains an Orders table with 100 million rows. You need to reduce both the memory the model uses and the time it takes to refresh. Which two actions should you perform?

A. Split OrderDateTime into separate date and time columns
B. Replace TotalQuantity with a calculated column
C. Convert Quantity to the text data type
D. Replace TotalSalesAmount with a measure

40. You are building a calculation group and need a calculation item that shifts the selected date context to month-to-date. How should you complete the DAX expression?

A. GENERATE with SELECTEDVALUE
B. CALCULATE with SELECTEDMEASURE and DATESMTD
C. COMBINEVALUES with SELECTEDMEASURE
D. FILTER with DATESMTD

Answer key, Section 4: 36-D, 37-A and B, 38-D, 39-A and D, 40-B

What Is FUAM? Fabric Unified Admin Monitoring Guide

Maxime Marlot — Tue, 21 Jul 2026 19:32:06 GMT

If you run Microsoft Fabric at any real scale, you already know the monitoring story is scattered. Capacity sits in one app, activity logs live behind an admin API, workspace inventory hides in the Scanner API, and tenant settings are their own screen. FUAM, short for Fabric Unified Admin Monitoring, is the open-source answer to that.

FUAM collects those feeds into a single Lakehouse on your own capacity and puts Power BI reports on top.

FUAM is a community-built accelerator that pulls your whole Fabric tenant into one Lakehouse.

This is a practitioner’s look at what the tool does, where its data comes from, how it stacks up against the first-party Capacity Metrics app, and the part most write-ups skip: what it costs in real capacity units.

What is FUAM, and why it exists

FUAM is a solution accelerator, not a product. It started as a community project and now lives inside Microsoft’s fabric-toolbox repository under an MIT license, where it’s actively developed. That distinction matters. There’s no SLA, no support queue, and no promise that a given release won’t break. If something goes wrong, you file a GitHub issue and wait.

So why run an unsupported tool? Because the native pieces were never built to talk to each other. The Capacity Metrics app watches CUs, the admin monitoring workspace covers a slice, the Scanner API returns inventory as raw JSON, and none of them share a model. It exists to stitch those feeds together and keep the history, so a platform team can answer tenant-wide questions from one place.

Everything is assembled from Fabric items you already recognize. Pipelines and notebooks handle extraction and transformation, a Lakehouse stores the results in Delta Parquet, and two semantic models feed the reports in Direct Lake mode. Because the raw tables sit in your own Lakehouse, you can query them straight from the SQL endpoint or build your own model next to the ones that ship.

Fabric Unified Admin Monitoring Orchestration and Pipelines

What it monitors by FUAM: capacity, activity, and inventory

One orchestration pipeline, Load_FUAM_Data_E2E, drives the whole collection process and handles both the first full load and later incremental runs. Beneath it sit a set of modules, each responsible for one slice of the tenant:

Capacities — capacity properties and the users assigned to them.
Capacity Metrics — CU consumption by timepoint and by item kind per day, pulled from the Capacity Metrics app’s semantic model over XMLA.
Activities — the tenant activity log of user actions, plus a rolling 30-day aggregate.
Inventory — a full tenant scan via the Scanner API: semantic models, lakehouses, warehouses, reports, and the rest of your items.
Tenant settings — snapshots of tenant and delegated settings, so you can see what changed and when.
Workspaces, refreshables, and Git connections — which workspaces exist (personal ones excluded), what’s scheduled to refresh, and which workspaces are wired to source control.

That combination is what turns it into Fabric tenant monitoring rather than plain capacity watching. You get Fabric capacity utilization monitoring sitting in the same model as your inventory and activity history, which is exactly the join you can’t easily make when every source lives in a separate tool.

A newer optimization module (still beta) goes a layer deeper, pulling Best Practice Analyzer and VertiPaq Analyzer results for semantic models. Near-real-time CU tracking through Capacity Utilization Events is on the roadmap but hasn’t shipped.

What you can actually do with FUAM

Collection is only half the point. The reason to stand this up is the questions it lets you answer across a whole tenant instead of one capacity at a time. A few things the standard reports and model make straightforward:

Track long-term CU utilization well beyond the roughly 14 days the Capacity Metrics app keeps, so you see seasonality and growth instead of just the last two weeks.
Rank the items driving consumption. The item-level report and model let you sort semantic models, reports, and pipelines by the load they put on a capacity.
Spot orphaned content by cross-referencing inventory against activity, which is how you find the fifty workspaces nobody has opened in months.
Audit tenant settings over time and trace who’s been active where.
Extend any of it. Since the Lakehouse is yours, you can join FUAM’s tables to your own chargeback data or a CMDB and build reports the shipped ones don’t cover.

That last point is the real draw for teams with a data engineer on hand. The bundled reports are a starting layer, and the capabilities you actually want tend to be the ones you build on top of the raw tables.

FUAM vs the Fabric Capacity Metrics app

This is the comparison people ask for most, and the usual framing is off. These two aren’t competitors. One depends on the other.

FUAM VS Fabric Capacity Metric App

The Capacity Metrics app

First-party, Microsoft-supported, installed from AppSource.
Focused on capacity: CU usage, throttling, overloads, interactive versus background operations.
Short retention (a rolling window of roughly 14 days) and a fixed model you don’t really customize.
The right tool for “is my capacity throttling right now, and why.”

Fabric Unified Admin Monitoring

A community accelerator you deploy and own, with no official support.
Tenant-wide: capacity plus inventory, activities, tenant settings, refreshables, and Git connections in one model.
Long history, kept as long as you keep loading it, and fully open to extend.
The right tool for “what’s happening across my whole estate over time.”

Here’s the catch worth stating plainly. FUAM reads its capacity numbers from the Capacity Metrics app’s semantic model through the XMLA endpoint, so you need that app installed and a compatible build (the docs call out specific versions such as v65, v53, v47, and v44 or earlier). Because that extraction rides on the app’s internal model, a change on Microsoft’s side can break the feed without warning. Keep the app, and treat it as the layer that unifies everything around it.

What FUAM costs to run

The software is free. There’s no license and no per-seat charge. The cost is capacity, because every pipeline run, notebook execution, and report query burns CUs on the Fabric capacity you deploy it to, plus a small OneLake storage footprint for the Delta tables.

You need a P or F SKU to run it. PPU and Pro shared workspaces aren’t supported, and deployment also needs a service principal, Fabric admin rights, and the XMLA endpoint enabled on the Capacity Metrics app.

So what does it actually consume? Real-world figures from practitioners running it on an F64 land in a fairly consistent range. Treat these as community reports, not official Microsoft numbers:

Steady state stays low, often around 2 to 5% of capacity over time.
A single scheduled run tends to sit near 10% of CU while the pipeline is active.
The initial backfill is the heaviest moment and can touch 10 to 15% for a short window, depending on how much history you pull and how many workspaces you scan.
Storage is negligible next to compute.

Consumption scales with metadata volume, not raw data size, so more workspaces and more activity mean longer notebook runs. Two things bite people. Over-scheduling is the first. Running the pipeline every few minutes is rarely worth it, and every 30 to 60 minutes (or even twice a day) covers most admin needs. The shipped semantic model is the second. It isn’t tuned well, and at least one team reported a single report filter nearly pushing an F64 into throttling, which is a strong argument for building your own model on the tables.

Takeaway: FUAM is free to license but not free to run. Budget a few percent of a mid-size capacity for steady state, schedule it off-peak, and load incrementally after the first backfill.

Subscribe now

FUAM Limitations

This earns its place when a single capacity app stops being enough: multiple capacities, a growing tenant, governance questions that need history, or a platform team that wants everything queryable in one Lakehouse. For one small capacity, the native app plus built-in workspace monitoring may be all you ever need.

Know the limits before you commit:

No official support. It’s community-maintained, so budget time to read code and follow releases.
Not real-time. Collection is batch, and freshness follows your schedule. Near-real-time CU events are still on the roadmap.
It leans on the Capacity Metrics app. Version drift there can break the capacity feed.
The default model needs work. Plan to tune or replace it before you point many users at it.
It competes for the capacity it measures. A common recommendation is to give the tool and the Capacity Metrics app their own small capacity, so a production overload doesn’t also blind the thing you use to diagnose it.

Deployed with those constraints in mind, Fabric Unified Admin Monitoring turns a pile of disconnected admin surfaces into one model you control. For anyone running Fabric past a single capacity, that consolidation is worth the setup, as long as you go in treating it as an accelerator you own rather than a product someone else supports.

Power BI Licensing Explained: Free vs Pro vs PPU vs Fabric Capacity

Maxime Marlot — Thu, 16 Jul 2026 19:14:37 GMT

Power BI is one of the most powerful tools in the data world, and it has become the default choice for analytics teams at companies of every size. Millions of data analysts, developers, and business users rely on Power BI every day to turn raw data into reports, dashboards, and decisions.

Power BI licensing trips up almost everyone the first time they try to share a report. The confusing part is that Power BI licenses come in two completely different shapes: most are billed per person, and one is billed per unit of compute.

Compare Power BI licenses: Free, Pro, PPU, and Fabric capacity

Buy the wrong one and you either can’t share your work or you sign a five-figure contract you never needed.

This guide breaks down the options that matter in 2026 (Free, Pro, Premium Per User, and Fabric capacity), plus the legacy Premium capacity SKU you’ll still see referenced, what each one includes, and how to choose the right Power BI license.

The five Power BI licenses compared

Here is the short version before the detail. Prices are US list, billed annually.

Power BI Free: $0 per user. Build reports in Power BI Desktop and save them to your own workspace. You can’t share with colleagues the normal way.
Power BI Pro: $14 per user per month. Publish, share, and collaborate. Everyone who opens your report also needs Pro.
Power BI Premium Per User (PPU): $24 per user per month. Everything in Pro plus enterprise features like large models and paginated reports. Authors and viewers both need PPU.
Power BI Premium capacity (P SKU): legacy dedicated capacity, now being replaced by Fabric. New customers buy Fabric instead.
Microsoft Fabric capacity (F SKU): dedicated compute billed by the hour, from about $263 per month (F2) up to roughly $8,400 per month (F64). At F64 and above, viewers read Power BI content with a free license.

Power BI Licensing Pricing and Features: Free vs Pro vs PPU vs Fabric Capacity

What each of the Power BI licenses lets you do

The Power BI licenses differ on two axes: the features you get, and who has to hold a paid seat. Keep both in mind as you read.

Power BI Free License

A Power BI free license lets one person do real work. You can connect to data, model it, build reports in Power BI Desktop, and publish to your personal My Workspace. The wall you hit is sharing.

A free user can’t hand a report to a coworker or publish an app that others open, with one exception I’ll come back to under Fabric.

For a solo analyst prototyping something, Power BI free licenses are genuinely useful. For a team that needs to collaborate, they’re a dead end.

Power BI Pro License

Power BI Pro is the license most organizations run on. It lets you publish to shared workspaces, collaborate with other Pro users, distribute apps, and schedule data refresh. The part that surprises people: sharing is symmetric. If you hold Pro and your audience doesn’t, they still can’t open the report. In a pure Pro setup, every author and every viewer needs their own $14 seat. Fifty report consumers means fifty seats.

Power BI Premium Per User License

PPU sits between Pro and full capacity. A Power BI Premium Per User license costs $24 per month and adds the features data teams ask for once they outgrow Pro: semantic models up to 100 GB, paginated reports, the XMLA endpoint for external tools, deployment pipelines, and refresh up to 48 times a day.

The same symmetry rule applies here. Content published to a PPU workspace can only be opened by other PPU users, so you can’t pair a few PPU authors with hundreds of Pro viewers. That’s precisely why PPU fits small analyst teams and stops making sense once your audience grows.

Premium capacity and Fabric F-SKU

Here is where per-user pricing ends. A Power BI Premium license in the old model meant a P SKU, a block of dedicated capacity your reports run on instead of the shared pool. Microsoft has folded that into Microsoft Fabric, the P SKUs are being retired, and new purchases are Fabric F SKUs.

A Microsoft Fabric license at the capacity level buys compute, not seats. The number that matters is F64. At F64 and above, anyone with a free license can view reports hosted on that capacity, up to roughly 5,000 viewers. Below F64 (F2 through F32), you still need Pro for every user, so small capacities give you performance, not licensing relief. Authors who publish always need at least Pro, even on a large capacity.

Power BI Per-user vs Fabric capacity pricing

All Power BI licenses eventually answer one question: are you paying per person, or per capacity?

Per-user pricing is simple and scales in a straight line. Ten users on Pro is $140 a month. A hundred is $1,400. Two hundred is $2,800. It stays cheap until your viewer count climbs, then the bill climbs right along with it.

Capacity pricing is flat. F64 runs roughly $8,400 a month pay-as-you-go, or about $5,000 a month on a one-year reservation, regardless of how many people read the reports (within that 5,000-viewer ceiling). The trade is that you pay the same whether 60 people or 4,000 people use it.

So there’s a break-even point, and it’s worth calculating before you commit. Compare the capacity bill against what you’d spend on Pro seats for your viewers.

Rule of thumb: F64 reserved (around $5,000 a month) is roughly 350 Pro viewers at $14 each. Below that headcount, per-user Power BI licenses are cheaper. Above it, capacity wins, and the gap widens quickly.

Authors are a separate line on the invoice. Even on F64, the people building and publishing content still need Pro or PPU seats. Capacity removes the viewer tax, not the builder tax.

When you actually need Fabric or a Premium license

Reach for capacity when one of these is true.

Your viewer count crosses roughly 300 to 350 people. Past that line, buying a Power BI premium license as Fabric capacity costs less than stacking up Pro seats.
You need Premium-only features across a wide audience. Paginated reports, large models, or frequent refresh delivered to hundreds of viewers is a capacity job, not a per-user one.
You’re building beyond Power BI. Fabric capacity also powers data engineering, warehousing, real-time intelligence, and notebooks on the same compute, so if reporting sits inside a broader Fabric project, the capacity is already paid for.
You want free-license viewing. The F64 threshold on a Microsoft Fabric license lets thousands of people read reports without individual seats.

Stay on per-user licensing when your audience is small, your needs fit inside Pro, and you’d rather not operate a capacity. Most teams under a couple hundred users come out cheaper and simpler on Pro.

A short decision flow

Run through these in order and stop at the first yes.

Working alone and only building for yourself? A Power BI free license covers it. Spend nothing.
Sharing reports with a handful of colleagues? Put everyone involved on Power BI Pro. This is the right answer for most small teams.
A few analysts need large models, paginated reports, or the XMLA endpoint, but the viewer group is also small? Give those users a Power BI Premium Per User license and keep the group tight.
Distributing to a few hundred viewers or more, or need Premium features at scale? Move to Fabric capacity, and size at F64 or above so viewers read content on free licenses.
Already running a Fabric project for engineering or warehousing? The capacity is bought, so publish your Power BI content there and license only your authors with Pro.

The mistake I see most often is jumping to capacity too early because it sounds enterprise-grade, or clinging to per-user seats long after 500 viewers turned them into the expensive option.

Count your viewers, check the break-even, and let the number pick the plan. When your Power BI licenses match your actual audience, the bill takes care of itself.

AI-901 Practice Exam: 40 Free Practice Questions with Answers

Maxime Marlot — Sun, 05 Jul 2026 19:12:27 GMT

AI-901: Microsoft AI Fundamentals, Practice Assessment

In a previous article, I shared my personal experience passing AI-901. Based on that, I started writing a full study guide and exam preparation for the new Microsoft Azure AI Fundamentals exam, the one now replacing AI-900. You can find both here:

My experience and the full study guide: AI-901 Study Guide and Exam Preparation
The condensed AI-901 cheat sheet (PDF): Subscribe below to get the PDF!

⚠️ These are original practice questions, written from Microsoft's published AI-901 skills outline and my own exam experience in July 2026. They are not real exam items. The exam is new and can still change, so use this for preparation, not as a guarantee.

How this works? Substack has no quiz tool, so all 28 questions come first, grouped into three sections. The answer key sits at the bottom of each section.

The exam breaks into three areas, so this practice test does too:

Responsible AI: 8 questions
AI concepts and workloads: 22 questions
Microsoft Foundry and Azure OpenAI: 10 questions

Go to the next step by preparing the DP-600: Fabric Analytics Engineer Associate

Responsible AI - AI-901 Practice assessment

1. A hiring model recommends noticeably fewer qualified candidates from one region than from others with identical qualifications. Which Microsoft responsible AI principle is most at risk?

A. Reliability and safety
B. Fairness
C. Transparency
D. Inclusiveness

2. A bank must be able to explain to a rejected applicant why the AI declined their loan. Which principle does this support?

A. Accountability
B. Privacy and security
C. Transparency
D. Fairness

3. Two review teams assess the same model. Team A finds it performs worse for older users. Team B finds that no one can explain how it reaches a decision. Which principles did Team A and Team B identify, in that order?

A. Transparency, then Fairness
B. Fairness, then Transparency
C. Fairness, then Accountability
D. Inclusiveness, then Transparency

4. A customer service chatbot must refuse harmful requests and behave predictably even with unexpected input. Which principle applies, and which Azure capability most directly supports it?

A. Privacy and security (managed identities)
B. Reliability and safety (content filters and model evaluation)
C. Transparency (model cards)
D. Accountability (audit logs)

5. A solution must keep customer personal data protected and let services authenticate without secrets stored in code. Which principle and feature best fit?

A. Inclusiveness
B. Fairness
C. Privacy and security
D. Transparency

6. A team adds captions, screen reader support, and multiple languages so their app works for people of all abilities and backgrounds. Which principle is this an example of?

A. Inclusiveness
B. Accountability
C. Reliability and safety
D. Fairness

7. An organization keeps an auditable record of AI decisions and assigns a named human owner who is answerable for outcomes. Which principle does this reflect?

A. Transparency
B. Accountability
C. Privacy and security
D. Reliability and safety

8. (Choose TWO.) Which two are ways Microsoft helps you apply responsible AI to a generative model?

A. Configure content filters on the model deployment
B. Increase max_tokens to reduce hallucination
C. Review the model card for capabilities and limitations
D. Set temperature to 1 for higher accuracy
E. Store the API key in the application source code

Answer key, Section 1: 1-B, 2-C, 3-B, 4-B, 5-C, 6-A, 7-B, 8-A/C

Subscribe now

AI concepts and workloads - AI-901 Practice assessment

9. A voice assistant must read the day’s weather aloud in a natural sounding voice. Which capability of Azure Speech is required?

A. Speech recognition
B. Speech synthesis
C. Speech translation
D. Language detection

10. In Microsoft Foundry, which statement correctly describes hubs and projects?

A. A project contains many hubs
B. Hubs and projects are the same thing
C. A hub is a top level container for governance, security, and quota, and a project lives inside it where you build
D. A hub is only for billing

11. You must automatically pull out and categorize people, organizations, dates, and locations across thousands of contracts. Which Azure Language capability fits best?

A. Sentiment analysis
B. Named entity recognition
C. Key phrase extraction
D. Language detection

12. A team needs a fast, low cost model for high volume, simple chat replies, and does not need advanced reasoning. Which choice best balances cost and capability?

A. GPT-4o
B. An embedding model
C. GPT-4o-mini
D. A Document Intelligence prebuilt

13. A call centre needs to turn recorded customer calls into searchable text transcripts. Which capability do you need?

A. Text to speech
B. Speech synthesis
C. Speech to text
D. Key phrase extraction

14. You want to browse available models, read a model’s capabilities and limits, then deploy one. In the Foundry portal, which order of areas do you use?

A. Model catalog, then Model card, then Deployments
B. Playground, then Agents, then Evaluation
C. Deployments, then Model catalog, then Playground
D. Agents, then Evaluation, then Model catalog

15. A company must automatically detect and redact personal information, such as names, phone numbers, and email addresses, from customer support transcripts before storing them. Which Azure Language capability should they use?

A. Key phrase extraction
B. PII detection
C. Sentiment analysis
D. Named entity recognition

16. You are building an app that extracts structured fields, including nested values, from invoices, images, and short audio notes, using a schema you describe in natural language. What should you use?

A. An OCR only document pipeline
B. A transcription workflow in Azure Speech in Foundry Tools
C. An analyzer in Azure Content Understanding
D. Azure AI Search

17. A brand wants to know whether social posts about its product are positive, negative, or neutral. Which text analysis technique applies?

A. Summarization
B. Entity recognition
C. Sentiment analysis
D. Translation

18. When you deploy a model in Foundry, which setting controls how many tokens per minute the deployment can handle, and therefore affects throughput and cost?

A. Temperature
B. The system prompt
C. The endpoint name
D. Capacity

19. (Choose TWO.) Which two are text analysis capabilities of Azure Language?

A. Named entity recognition
B. Text to speech
C. Sentiment analysis
D. Image generation
E. Optical character recognition of scanned forms

20. You only need to extract fields from a known, standard form type (supplier invoices) with high, deterministic accuracy. Which is most appropriate?

A. A custom Content Understanding analyzer
B. A Document Intelligence prebuilt model
C. A multimodal chat model
D. Azure AI Search

21. You just need the raw printed and handwritten text extracted from scanned pages, with no schema and no field mapping. Which capability fits?

A. OCR
B. Sentiment analysis
C. Text to speech
D. Key phrase extraction

22. You deployed a gpt-4o model under the deployment name prod-chat. In your client code, which value do you pass as the model to call it?

A. gpt-4o, the base model name
B. The resource endpoint URL
C. prod-chat, the deployment name
D. The Azure region

23. An app should accept a photo and answer free form questions about it, for example “what is unusual here?”. What is the best fit?

A. An OCR algorithm
B. A deployed multimodal model
C. Key phrase extraction
D. Speech synthesis

24. You need to extract fields from a document type that is unique to your company and is not a standard form. Which Content Understanding option fits?

A. A prebuilt analyzer
B. Read (OCR)
C. Speech to text
D. A custom analyzer

25. A marketing team wants to generate brand new images from text descriptions. Which model type do they need?

A. An embedding model
B. DALL·E
C. A speech model
D. A Document Intelligence prebuilt

26. Before writing any code, where in the Foundry portal can you interactively test a deployed model with different prompts?

A. The Evaluation tab
B. The Playground
C. The Model catalog
D. The Hub settings

27. In a generative AI model, what are embeddings?

A. The maximum length of a response
B. Numeric vector representations
C. The system prompt rules
D. The deployment region

28. (Choose TWO.) What are two purposes of the system prompt (the instructions) for a generative model?

A. Define the model’s role and behaviour
B. Select which model to deploy
C. Define constraints on the model’s responses
D. Set the tokens per minute quota
E. Define the user question and intent

29. A chatbot keeps giving outdated answers about your company’s internal policies. Which technique most directly improves accuracy by supplying your own data at query time?

A. Increasing temperature
B. Fine tuning the base model from scratch
C. Grounding the model with retrieval (RAG)
D. Raising max_tokens

30. (Choose TWO.) Which two factors make a smaller model such as GPT-4o-mini a good choice for a task?

A. Lower cost
B. Lower latency
C. It always produces more accurate answers
D. It is required for responsible AI
E. It supports more languages than larger models

Answer key, Section 2: 9-B, 10-C, 11-B, 12-C, 13-C, 14-A, 15-B, 16-C, 17-C, 18-D, 19-A and C, 20-B, 21-A, 22-C, 23-B, 24-D, 25-B, 26-B, 27-B, 28-A and C, 29-C, 30-A and B

Building with Microsoft Foundry: SDK, prompts and services - AI-901 Practice assessment

31. You are writing a Python app that uses Azure Speech in Foundry Tools. Which object holds the credentials and the service region so the speech service can be called?

A. AudioConfig
B. SpeechRecognizer
C. SpeechConfig
D. AIProjectClient

32. In the Azure Speech SDK, which object specifies where the audio comes from or goes to, for example the microphone or an audio file?

A. SpeechConfig
B. AudioConfig
C. SpeechSynthesizer
D. DefaultAzureCredential

33. You want your app to convert spoken audio into text. After creating a SpeechConfig, which object do you create to perform the transcription?

A. SpeechSynthesizer
B. A Content Understanding analyzer
C. An image analyzer
D. SpeechRecognizer

34. In the Foundry SDK, what is the recommended way to authenticate from your code without putting keys in the source?

A. Use DefaultAzureCredential with Microsoft Entra ID
B. Hard code the API key in a variable
C. Store the key in the system prompt
D. Pass the key as the model name

35. You are initializing the Foundry project client in Python. The Azure AI resource (account) is Resource1, the project is project1, and the model gpt-4o is deployed under the name my-mini-gpt. Which URL is the correct endpoint, the base URL you pass to the client?

A. https://project1.services.ai.azure.com
B. https://my-mini-gpt.services.ai.azure.com
C. https://resource1.services.ai.azure.com/api/projects/project1
D. https://gpt-4o.openai.azure.com

36. You need a GPT model to hold a back and forth conversation and return text responses to user messages. Which capability do you call?

A. The embeddings API
B. The chat completions API
C. The image generation API
D. Speech synthesis

37. In the Foundry SDK, which method do you call on the OpenAI client to generate a chat completion from a deployed model?

A. client.completions.generate(...)
B. client.chat.run(...)
C. client.models.invoke(...)
D. client.chat.completions.create(...)

38. (Choose TWO.) When you configure Azure Speech in Foundry Tools in code, which two values do you typically set on the SpeechConfig?

A. The service region
B. The subscription key or credential
C. The deployment name of a GPT model
D. The number of agent threads
E. The Azure resource group

39. When should you use an agent instead of a single chat completion call?

A. When you need one stateless answer
B. When you only need to translate text
C. When you need tools, state across turns, or multi step actions
D. When you want lower cost for a single reply

40. You need a deployed model to give consistent, repeatable answers for the same input. Which setting should you adjust?

A. Set temperature to its maximum
B. Set a low temperature
C. Increase max_tokens
D. Change the deployment name

Answer key, Section 3 (questions 31 to 40): 31-C, 32-B, 33-D, 34-A, 35-C, 36-B, 37-D, 38-A and B, 39-C, 40-B

Conclusion

That wraps up this AI-901 practice exam: 40 AI-901 practice questions with answers across responsible AI, AI concepts, and Microsoft Foundry. Use it as your final AI-901 exam preparation, review the full AI-901 Study Guide and the AI-901 cheat sheet PDF for anything shaky, and retake these AI-901 practice questions until you are ready to pass Azure AI Fundamentals on the first attempt. Subscribe for more AI-901 exam prep and study guide updates, and good luck passing AI-901.

AI-901 Study Guide and Exam Preparation

Maxime Marlot — Fri, 03 Jul 2026 20:03:14 GMT

I just passed AI-901, the new Microsoft Azure AI Fundamentals exam that replaces AI-900 (which retires on June 30, 2026). If you've started your AI-901 exam preparation, you've already noticed the problem: there's barely any AI-901 study material out there yet, and no official AI-901 practice test. So while it was fresh, I wrote down exactly what's on it and where the points hide. In addition, I made a 40 questions practice assessment for you to practice and a PDF version of this AI-901 study guide and cram sheet.

⚠️ Read this first. AI-901 is brand new. Official content is still thin, there’s no official practice assessment yet, and Microsoft can adjust the exam at any time. Everything below reflects the public study guide and my own sitting as of July 2026 — treat it as a field guide, not gospel.

This is the study guide I wish I’d had. Let’s get you certified.

How the AI-901 exam splits — and where the points are

The cleanest way to think about AI-901 is three buckets. These question counts are my own read from the exam, not official weights — but they’ll tell you where to spend your energy:

Microsoft Responsible AI — 25% of the exam
General AI use cases (generative AI, NLP, and vision in a multimodal context — no classic ML theory) — 50% of the exam
Foundry / Azure OpenAI — (the hardest, with the least Microsoft Learn coverage) - 25% of the exam

The one-line strategy: Sections 1 and 2 are your easy, bankable points — learn them properly and you walk in with momentum.

🚫 What is NOT in the AI-901: classic machine-learning theory. I got zero questions on supervised vs unsupervised learning, regression, or classification. Don’t spend a minute studying it. This a major difference with the AI-900.

Two facts that should change how you prepare:

It’s not open book. Information from different sources diverge on this point. In my case, there was no access to MS Learn documentation.
Python is read, not write. You’ll be shown a 10–15 line SDK snippet and asked what it does or which line breaks it. You never have to write code.

The rest of the essentials: passing score 700/1000, roughly 40–60 questions (In my case, it was 42), about 45–60 minutes.
And the overall vibe is implementation-aware — expect portal screens and code-reading, not just textbook definitions.

AI-901 - Core concepts to study

Subscribe now

AI-901 - Microsoft Responsible AI (learn all six principles)

This is the section people underestimate. On AI-901, Responsible AI is widely tested, the closest thing to free points on the exam, if you can match a scenario to the right responsible-AI principle. The AI-901 exam rarely asks for a definition in the abstract; it describes a situation and asks which principle applies, and often which Azure feature upholds it.

So learn all six concepts, a concrete example, and the Azure feature it maps to.

Fairness

Concept: the system treats all groups equitably, without bias.
Example: a loan-approval model rejects far more applicants from one neighborhood, that’s a fairness failure.
On Azure: fairness assessments and balanced training data.

Reliability & safety

Concept: the system behaves consistently and safely, even with unexpected or hostile input.
Example: a chatbot refuses to produce harmful content and stays stable under weird prompts.
On Azure: content filters and model evaluation.

Privacy & security

Concept: personal and sensitive data is protected end to end.
Example: customer PII never leaks into a response or a log.
On Azure: managed identities and private endpoints.

Inclusiveness

Concept: the solution works for people of all abilities, languages, and backgrounds.
Example: captions for audio and multilingual support so no one is shut out.
On Azure: accessibility and multilingual capabilities.

Transparency

Concept: people can understand how the system works and why it made a decision.
Example: being able to explain why an applicant was rejected.
On Azure: model cards.

Accountability

Concept: humans remain responsible for the system’s outcomes.
Example: a reviewable record of what the AI did and who signed off.
On Azure: audit logs and abuse monitoring.

Quick recap:

Fairness — equitable across groups. On Azure: fairness assessment / balanced data.
Reliability & safety — consistent and safe under stress. On Azure: content filters + evaluation.
Privacy & security — protects data. On Azure: managed identities / private endpoints.
Inclusiveness — works for everyone. On Azure: accessibility & multilingual.
Transparency — understandable decisions. On Azure: model cards.
Accountability — humans stay responsible. On Azure: audit logs + abuse monitoring.

If you can read a scenario and instantly name the principle and the feature, you’ve banked this whole section.

Thanks for reading HiStack.net - AI & System Design Newsletter! This post is public so feel free to share it.

AI-901 - General AI use cases (pick the right service)

Section 2 is really one skill dressed up many ways: read a scenario and choose the right task or service — often in a multimodal setting. Memorize this decision tree and most of these questions answer themselves:

Content Understanding (build an analyzer) → describe the fields you want in plain language; an LLM reasons over docs, images, audio or video and returns structured JSON
Content Understanding – Read → OCR text from a page or image
Content Understanding – Layout → text, tables, selection marks, structure
Document Intelligence prebuilt → known forms: invoices, receipts, IDs, business cards
Azure Speech → speech-to-text and text-to-speech
Azure Language → sentiment, entities, PII, summarization
Azure OpenAI / Foundry models → generate, reason, or understand an image

NLP (Azure AI Language). Entity recognition (NER), key phrase extraction, sentiment analysis, PII detection, language detection, summarization.
Entity recognition categorizes specific things (people, dates, organizations);

Speech (Azure AI Speech). The distinction I’d bet on seeing: speech recognition = speech-to-text (transcribe a call-center recording) vs speech synthesis = text-to-speech (a navigation app reading directions aloud, with neural voices). Recognition listens; synthesis speaks.

Vision & multimodal. Azure AI Vision handles captions, tags, and OCR. But a deployed multimodal model (like GPT-4o) can read an image directly in the prompt and reason about it. Extracting text off a scanned form is OCR; asking “what’s unusual in this photo?” is multimodal reasoning.

The extraction trap — Content Understanding vs Document Intelligence. This one’s a favorite:

Content Understanding — you describe the fields you want in natural language, it handles multimodal sources (documents, images, audio, video), and returns structured JSON. Choose it for free-form or non-document input.
Document Intelligence prebuilts — deterministic extraction from known form types (invoices, receipts, IDs).

Generative AI basics. Know the vocabulary plainly: tokens (chunks of text), embeddings (numeric vectors of meaning), prompts (your input), grounding / RAG (feeding the model your own data so answers stay accurate), and hallucination (confident but wrong output).

AI-901 - Foundry / Azure OpenAI

This is the hard bucket and the one with the thinnest Microsoft Learn coverage when I sat it. Slow down here.

Azure AI Foundry — the platform

Foundry is the unified platform to build, evaluate, and deploy AI on Azure. It pulls together the Foundry portal, the model catalog, the Foundry SDK, the Agent Service, prompt flow / evaluation, and the integrated Azure AI services (Speech, Vision, Content Understanding, Search) exposed as Foundry Tools.

Hub vs Project — the classic trap. A hub is the top-level collaboration and governance container: shared security, connections, compute, and quota.
A project lives inside a hub and is where you actually build — deployments, agents, data, evaluations. One hub → many projects.

Portal map to recognize on screen: Model catalog → Model card → Deployments → Playground → Agents → Evaluation. Questions describe or show these tabs, so know what each one does.

Deploy & configure a model

The catalog holds Azure OpenAI models (GPT-4o, GPT-4o-mini for cheap/fast, embeddings, DALL·E for images), open-weight models, and Microsoft-published models. Pick by capability — multimodal for image/speech input, a small model for cost and latency.

Deployment options that show up as answers: region, capacity (tokens-per-minute / quota), and the content filter attached to the deployment. Crucial detail: the deployment name — not the base model name — is what your code calls.

temperature — higher = more creative/random; lower = more deterministic. Exam cue: “consistent, repeatable output” → low temperature.
top_p — nucleus sampling, an alternative to temperature (tune one, not both). Exam cue (distractor): “set both to 1 for accuracy.”
max_tokens — caps response length (prompt + completion share the context window). Exam cue: “response got cut off” → raise max_tokens.

Prompt roles — the single most-tested distinction

system — sets persona, tone, rules & guardrails for the whole conversation. Contains: “Be formal”, “only answer HR policy”, “stay on topic”.
user — the end-user’s actual request this turn. Contains: “Summarize this contract.”
assistant — prior model replies, replayed to give conversational memory. Contains: earlier answers in the thread.

🎯 The trap: behavioral instructions (”respond formally”, “stay on topic”) belong in the system prompt. Putting them in the user prompt or in deployment settings is the planted wrong answer.

Reading the Foundry SDK (Python)

You won’t write this — you’ll read it.

python

from azure.identity import DefaultAzureCredential
from azure.ai.projects import AIProjectClient

project = AIProjectClient(                       # 1. connect to the project
    endpoint=”https://.services.ai.azure.com/api/projects/”,
    credential=DefaultAzureCredential())         #    Entra ID, not an API key

client = project.get_openai_client()             # 2. OpenAI-compatible client

resp = client.chat.completions.create(
    model=”gpt-4o-mini”,                         # 3. the DEPLOYMENT name
    messages=[
        {”role”: “system”, “content”: “You are a concise travel guide.”},
        {”role”: “user”,   “content”: “Three things to do in Paris?”}])

print(resp.choices[0].message.content)           # 4. read the reply

What gets asked: what this does · which role sits on which message · which line breaks it if removed (auth, client, or the deployment name). Exact method names vary by SDK version — the exam tests the roles, flow, and return shape, not signatures.

Agents in Foundry

An agent is a persistent assistant: model + instructions + tools + threads (conversations) + runs (executions), built through the Agent Service. Its built-in tools are file search (grounding on your docs), code interpreter (runs code / analyzes data), and function calling (invokes your APIs).

Agent vs a single chat call (tested): use an agent when you need tools, state across turns, or multi-step actions; use a plain chat completion when one stateless response is enough.

The traps & X-vs-Y cheat sheet

Hub vs Project — Hub = governance/security/quota container; Project = where you build.
Deployment name vs model name vs endpoint — Deployment = your custom name (call this); Model = e.g. gpt-4o; Endpoint = resource URL + key.
temperature vs top_p — two ways to control randomness; tune one, not both.
system vs user prompt — behavioral rules go in system, not user.
Agent vs chat call — tools/state/multi-step → agent; one stateless answer → chat.
Content Understanding vs Document Intelligence — multimodal/free-form → Content Understanding; known forms → Document Intelligence.
Speech recognition vs synthesis — recognition = speech-to-text; synthesis = text-to-speech.

My final strategy to pass AI-901

Bank Sections 1 and 2. Learn all six responsible-AI principles properly and drill service selection — these are your easy points.
Pour your real study time into Section 3 (Foundry). It’s the biggest, hardest, thinnest-documented part.
Practice reading SDK snippets and recognizing portal tabs, not memorizing definitions.
Know the traps cold — hub vs project, deployment-name-vs-endpoint, prompt-role placement, Content Understanding vs Document Intelligence.
Personal note: I saw a lot of Speech/Audio SDK configuration questions. Review Speech, but don’t over-fit — I can’t confirm which other SDK services show up.
Skip classic ML theory. It wasn’t there.

AI-901 FAQ

Is AI-901 hard? It’s a fundamentals exam, so it’s approachable — but the Foundry / Azure OpenAI section is genuinely harder than anything on the old AI-900, mostly because there’s so little material to study from yet.

How many questions is AI-901, and how long? Expect roughly 40–60 questions in about 45–60 minutes — around a minute per question.

Is AI-901 open book? No, although information from different sources diverge on this point. In my case, there was no access to MS Learn documentation.

Do I need to know Python for AI-901? Only to read it. You’ll parse short SDK snippets; you never write code.

Is my AI-900 certification still valid? Yes. AI-900 and AI-901 earn the same credential — Microsoft Certified: Azure AI Fundamentals — and it doesn’t expire.

What’s the passing score? 700 out of 1000.

Where to go next

Concepts stick when you test them — especially with no official practice assessment out yet.

👉 Test yourself: I built a set of 20+ free AI-901 practice questions with answers and explanations, organized by these same three sections → AI-901 Practice Questions(opens in new window).
📥 Keep the cheat sheet: subscribers can download my AI-901 PDF study guide — every principle, decision tree, and trap on one quick reference → Download the PDF(opens in new window).

If this made AI-901 click, subscribe — I’m releasing the practice set and study guide as a short AI-901 series and updating everything as Microsoft firms up the official content. Good luck; it’s a very passable exam once you know where the points hide.

Essential Python Libraries for Data Science

Maxime Marlot — Mon, 24 Mar 2025 08:02:50 GMT

If you're looking to start your journey in data science, one of the first questions you might ask is: What tools should I use? Python is the go-to language for data science, and it offers a powerful ecosystem of libraries to help you get started.

Key Python Libraries for Data Science: Machine Learning, NLP, Data Visualization, and Computer Vision

We will break down the key Python libraries you need to know where to start your data science journey. Whether you're working on machine learning, data visualization, natural language processing, or computer vision, these libraries will set you on the right path.

Getting Started with Data Science in Python

Before diving into coding, it's important to understand the fundamental steps of data science:

Data Collection & Preparation – Cleaning and structuring data for analysis.
Exploratory Data Analysis (EDA) – Understanding patterns and trends.
Machine Learning & AI – Building predictive models.
Data Visualization – Communicating insights through charts and graphs.
Deployment – Integrating models into real-world applications.

To tackle these steps, let’s look at the essential Python libraries you need to start your data science journey.

Best Python Libraries for Data Science

1. Machine Learning Libraries

Machine learning is a key part of data science, and these libraries will help you build models efficiently:

Scikit-learn – A beginner-friendly library for traditional machine learning models like regression, classification, and clustering.
Pandas – The best tool for data manipulation and analysis, helping you structure datasets for machine learning.
NumPy – Provides numerical computing power, essential for handling large datasets.
XGBoost – A high-performance library for building powerful predictive models using gradient boosting.

2. Data Visualization Libraries

Data visualization helps you understand and present data insights clearly:

Seaborn – Great for statistical data visualization, making charts visually appealing.
Plotly – Enables interactive and dynamic visualizations for dashboards.
Streamlit – Helps build interactive web applications for data science projects.
UMAP – Primarily used for dimensionality reduction but also useful for visualizing high-dimensional data.

3. Natural Language Processing (NLP) Libraries

If you're working with text data, these libraries will help you analyze and process it efficiently:

Hugging Face Transformers – The best library for working with pre-trained language models like BERT and GPT.
spaCy – A fast and efficient NLP library for tokenization and entity recognition.
LangChain – Ideal for building applications that interact with large language models (LLMs).
vLLM – Optimized for running LLMs efficiently, improving performance.

4. Computer Vision Libraries

For those interested in image processing and deep learning, these libraries are essential:

OpenCV – The most popular library for image processing and real-time computer vision.
Scikit-Image – A specialized tool for advanced image processing within the SciPy ecosystem.
TensorFlow & PyTorch – Two leading deep learning frameworks for training AI models.

How to start learning Data Science?

If you're new to data science, follow these steps to get started:

Learn Python Basics – Get comfortable with Python syntax and basic programming concepts.
Master Pandas and NumPy – These two libraries are the foundation of data analysis.
Practice with Real Data – Use Kaggle datasets or your own data for hands-on projects.
Understand Machine Learning – Start with Scikit-learn to build simple models.
Work on Visualization – Learn Seaborn and Plotly to present your insights effectively.
Explore NLP or Computer Vision – Depending on your interest, try Hugging Face for text or OpenCV for images.

How to Build a RAG Pipeline for AI: Improve LLMs with Retrieval-Augmented Generation

Maxime Marlot — Wed, 19 Mar 2025 03:00:42 GMT

Large Language Models (LLMs) like GPT-4, Claude, and Gemini are incredibly powerful, but they have some major limitations:

Limited Context Window – LLMs can only process a fixed number of tokens per prompt.
Static Knowledge – Once trained, they cannot update their knowledge unless retrained on new data.
Hallucinations – LLMs sometimes generate false or misleading information because they try to predict plausible answers rather than retrieving factual data.

RAG pipeline implementation: Enhancing LLMs with real-time knowledge retrieval.

RAG (Retrieval-Augmented Generation) enhances LLMs by allowing them to retrieve relevant external information in real time, rather than relying solely on their pre-trained knowledge. This significantly improves accuracy, making AI models more useful for real-world applications like chatbots, customer support, and research assistants.

How Does a RAG Pipeline Work?

Step 1: Ingesting and Processing Documents

Before an LLM can retrieve external knowledge, it needs a source of information. The first step is document ingestion, where raw data is extracted and processed from different formats, including:

Text files (PDFs, Word documents, PowerPoint slides)
Images & Scanned Documents (processed via Optical Character Recognition - OCR)
Web Pages & Databases

Why is document ingestion necessary?

LLMs can’t read raw files directly.
Extracting and formatting text ensures structured data processing for later retrieval.

💡 Tools for document ingestion:

LangChain – Handles multiple file formats efficiently.
PyMuPDF – Extracts text from PDFs.
Tesseract OCR – Converts images and scanned documents into text.

Step 2: Splitting Text into Chunks

Once the documents are ingested, they are broken down into smaller chunks for efficient retrieval.

Why do we split text into chunks?

LLMs work best with small, manageable pieces of text rather than large documents.
Smaller text chunks allow for faster and more relevant search results.

Best practices for text chunking:

Use overlapping chunks to preserve context.
Adjust chunk sizes based on document type (e.g., longer chunks for structured text like legal documents).

Note: If you have a 1,000-word article, chunking might create 10 sections of 100 words each, making retrieval faster and more precise.

Step 3: Converting Text to Embeddings

Each text chunk is then converted into a numerical representation known as an embedding.

What are embeddings?
Embeddings are vector representations of text that help the system find semantically similar content instead of relying on exact word matches.

Example: The phrase "AI in healthcare" will have an embedding close to "Machine learning in medicine" because of their conceptual similarity.

💡 Popular embedding models:

OpenAI’s text-embedding-ada-002
Google’s BERT
Hugging Face’s Sentence Transformers

Step 4: Storing Data in a Vector Database

Once the text embeddings are generated, they are stored in a vector database for fast retrieval.

Why use a vector database?

It allows quick similarity searches to find the most relevant information.
It supports real-time updates, so new data can be added without retraining the LLM.

💡 Popular vector databases for RAG:

FAISS (Facebook AI Similarity Search)
Pinecone (Optimized for production environments)
Azure AI Search DB

Step 5: Querying the RAG Pipeline

When a user submits a question or search query, the system follows these steps:

Convert the query into an embedding (same way document chunks were converted).
Search the vector database for the most relevant chunks.
Retrieve the top N chunks (e.g., the most similar 3-5 pieces of text).
Combine the query and retrieved text to generate a complete response.

Why is this better than traditional LLMs?

Instead of relying only on its pre-trained knowledge, the LLM gets real-time information from retrieved documents.
This makes the generated response more accurate and contextually relevant.

Step 6: Generating the Final Response

Finally, the retrieved text is fed into the LLM alongside the user query. The model processes the expanded context and generates a response that is:

More accurate
Less prone to hallucinations
Based on real-time information

This step completes the loop, allowing AI models to provide data-driven, up-to-date answers.

6 Best Practices for REST API Design

Maxime Marlot — Thu, 13 Mar 2025 09:02:22 GMT

APIs are the backbone of modern applications, and large-scale services like ChatGPT demonstrate why proper API management is critical. With over 300 million users per week and processing 1 billion queries daily, ChatGPT relies on robust API architecture to ensure security, uptime, and response time. Many of these best practices stem from software engineering principles, and in this article, we will review six key techniques to optimize REST API design.

Best Practices for REST API Design

1. What is Rate Limiting in REST APIs? (How to Prevent API Abuse)

Prevents user abuse and improves stability

Rate limiting controls how many API requests a user can make within a specific timeframe. This helps prevent system overloads, ensures fair usage, and protects against malicious attacks such as DDoS (Distributed Denial-of-Service). Implementing rate limiting through tools like API gateways or middleware ensures a more stable and secure API.

2. How Does Pagination Improve REST API Performance?

Reduces data load and speeds up responses

When an API returns large datasets, sending all the data at once can slow down performance. Pagination breaks down responses into smaller, manageable chunks, improving response time and reducing server strain. Implementing cursor-based or offset-based pagination enhances efficiency, especially for databases with extensive records.

3. Why Are API Keys Important for Security? (How to Secure Your API)

Prevents unauthorized API access

Authentication and authorization are critical for API security. API keys serve as a simple yet effective method to control access and prevent unauthorized usage. However, for enhanced security, consider using OAuth or JWT (JSON Web Tokens) for authentication alongside API keys.

4. What is Stateless Architecture in REST APIs? (Why It’s Important)

Simplifies scaling and session management

A RESTful API should be stateless, meaning that each request from a client contains all the necessary information to process it without relying on stored session data. This design principle enhances scalability and allows APIs to handle multiple concurrent requests efficiently. Statelessness simplifies load balancing and improves fault tolerance.

5. How Does Caching Improve REST API Speed? (Boost API Performance)

Speeds up responses

APIs that serve frequently requested data can benefit from caching mechanisms. Caching reduces database queries and speeds up response times by storing copies of responses at different layers (client-side, server-side, or CDN). Implement cache-control headers to manage data freshness and optimize API performance.

6. Why is API Versioning Important? (How to Avoid Breaking Changes)

Maintains compatibility during changes

APIs evolve over time, and changes can break existing integrations. Versioning allows developers to introduce new features without disrupting existing users. Using versioning techniques like URL-based (/v1/resource) or header-based versioning ensures backward compatibility while enabling future enhancements.

End-to-End Big Data Applications Architecture

Maxime Marlot — Mon, 10 Mar 2025 11:51:22 GMT

Data is the backbone of modern decision-making, driving everything from business strategies to AI-powered applications. However, raw data alone holds little value—it must be processed, analyzed, and structured into meaningful insights. This article breaks down an End-to-End Data Applications Architecture, explaining how data moves through a system from collection to deployment.

Big Data Applications Architecture Diagram

Data Collection and Storage

Organizations deal with various types of data:

Structured Data – Information stored in databases and spreadsheets, such as customer records or transaction logs.
Unstructured Data – Free-form data like emails, images, and documents that require additional processing before use.

Data collection is managed through time-based triggers or event-driven mechanisms, ensuring that new data is ingested at scheduled intervals or in response to real-time events. The data is then stored in a Data Lake, a centralized repository designed to handle both structured and unstructured data efficiently.

Data Processing and Preparation

Once collected, raw data must be transformed into a structured, usable format:

Data Exploration – Identifying patterns, anomalies, and trends in the dataset.
Data Preprocessing – Cleaning and normalizing data to remove inconsistencies and missing values.
Data Science Algorithms – Applying statistical and machine learning techniques to extract deeper insights.
Machine Learning Models – Training AI models to detect patterns and make predictions based on historical data.

This stage is critical for ensuring data quality and reliability before further processing or deployment.

Automation and System Integration

To maintain efficiency and scalability, automation plays a key role:

Automation Nodes – Manage workflows, schedule tasks, and ensure smooth data movement.
API Nodes – Provide interfaces for external applications to request and interact with processed data in real-time.

Automation reduces manual effort, streamlines data pipelines, and enables seamless integration with other business applications.

Deployment and Delivery of Insights

The final step is delivering insights to the right systems or users. This is achieved through Deployment Pipelines, which ensure that:

AI models are updated with new data.
Processed insights are integrated into business dashboards or applications.
Predictions and decisions are available in real-time or on demand.

Efficient deployment ensures that data-driven decisions can be made quickly and accurately, supporting business operations and AI-driven applications.