A lakehouse maintenance task aims to improve query performance and reduce storage costs for a Delta table with seven days retention and many small files. Which combination of commands should you run?

Prepare for the DP-600 Fabric Analytics Engineer Exam.

Multiple Choice

Explanation:

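When a table has many small files, query performance suffers because the engine must list, open, and read a large number of tiny files, and the per-file overhead adds up. Storage costs stay high for a separate reason: files that the current table version no longer references remain in storage until they are explicitly removed. The fix is to consolidate the small files first and then remove the old, unreferenced files.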

Running OPTIMIZE rewrites the data layout by compacting small files into fewer, larger files, which speeds up scans and reduces the overhead of reading many tiny files. Running VACUUM with a seven-day retention afterwards deletes files that the current table version no longer references and that are older than the retention window. VACUUM does not delete current data: it keeps every file the current version needs, preserves time travel across the most recent seven days, and frees the storage held by obsolete files. The small files that OPTIMIZE replaced stay on disk until they age past the retention window, after which a later VACUUM removes them.
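The two steps can be sketched as Spark SQL statements issued from a notebook. This is a minimal sketch under stated assumptions, not a definitive implementation: the table name `sales` and the helper `maintenance_statements` are hypothetical, and actually running the statements assumes a Spark session with Delta Lake support is available as `spark`.

```python
def maintenance_statements(table: str, retention_days: int = 7) -> list[str]:
    """Build the compaction and cleanup statements in the order to run them."""
    retention_hours = retention_days * 24  # VACUUM ... RETAIN is expressed in hours
    return [
        # Step 1: compact small files into fewer, larger files.
        f"OPTIMIZE {table}",
        # Step 2: delete unreferenced files older than the retention window.
        f"VACUUM {table} RETAIN {retention_hours} HOURS",
    ]


# "sales" is a hypothetical table name used only for illustration.
statements = maintenance_statements("sales")
for stmt in statements:
    print(stmt)
    # With an active Spark session (an assumption here), run each one:
    # spark.sql(stmt)
```

Building the statements separately from executing them keeps the order explicit and makes the retention value easy to check before anything is deleted.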

Running only VACUUM would not improve performance much, because the data would remain fragmented across many small files. Running only OPTIMIZE leaves the old, unreferenced files in storage. ANALYZE addresses neither file fragmentation nor the removal of old files; it gathers statistics for query planning rather than performing housekeeping. The best approach is therefore to optimize first to improve the layout, then vacuum with a seven-day retention to prune outdated files.
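The comparison between the options can be illustrated with a toy model. This sketch does not reproduce Delta Lake's actual implementation: the `DataFile` class, the `optimize` and `vacuum` functions, and the file sizes are all hypothetical, chosen only to show how each option changes the number of live files and the storage used.

```python
from dataclasses import dataclass


@dataclass
class DataFile:
    name: str
    size_mb: int
    referenced: bool             # still part of the current table version?
    days_since_removed: int = 0  # only meaningful when referenced is False


def optimize(files):
    """Toy compaction: merge all referenced files into one larger file."""
    live = [f for f in files if f.referenced]
    stale = [f for f in files if not f.referenced]
    merged = DataFile("compacted", sum(f.size_mb for f in live), True)
    # The replaced small files stay on disk, now unreferenced.
    retired = [DataFile(f.name, f.size_mb, False, 0) for f in live]
    return stale + retired + [merged]


def vacuum(files, retention_days=7):
    """Toy cleanup: drop unreferenced files older than the retention window."""
    return [f for f in files
            if f.referenced or f.days_since_removed <= retention_days]


def live_count(files):
    return sum(1 for f in files if f.referenced)


def disk_mb(files):
    return sum(f.size_mb for f in files)


table = [
    DataFile("old_version", 100, False, days_since_removed=10),
    DataFile("small_1", 1, True),
    DataFile("small_2", 1, True),
    DataFile("small_3", 1, True),
]

only_vacuum = vacuum(table)         # storage freed, still fragmented
only_optimize = optimize(table)     # compacted, old files still stored
both = vacuum(optimize(table))      # compacted and old files removed

for label, files in [("vacuum only", only_vacuum),
                     ("optimize only", only_optimize),
                     ("optimize then vacuum", both)]:
    print(label, "live files:", live_count(files), "disk MB:", disk_mb(files))
```

In this model only the combined run ends with a single live file and the old 100 MB file gone. The three small files that were just compacted are still on disk, because they have not yet aged past the retention window.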
