Power BI PA220: Database systems for data analytics Microsoft Power BI • an interactive data visualization software • focusing on business intelligence. • allows collaborating on and sharing customized dashboards and interactive reports. • a collection of software services, apps, and data connectors • to turn unrelated sources of data into coherent, visually immersive, and interactive insights. • data input directly from a database, webpage, or structured files • e.g., spreadsheets, CSV, XML, and JSON. December 14, 2022 PA220 DB for Analytics 2 Power BI Ecosystem • Power BI.com • online platform, for O365 tenants • Desktop • free app for dashboarding, reporting, ad-hoc visualizations • Mobile Apps • access to reports, … from mobile devices • Visuals Marketplace • store of visuals and R-powered visuals • Gateway • sync external data in and out for automated refreshes • Dataflow • implementation of power query for data transformations • Dataset • interface of data sources to reports and visualizations • Embedded • REST API for custom solutions • Report Server • on-premise solution for running dashboard and reports December 14, 2022 PA220 DB for Analytics 3 Power BI System Architecture December 14, 2022 PA220 DB for Analytics 4 Power BI announcements • February 2016: • Gartner has positioned Microsoft as a Leader, for the ninth consecutive year, in the Magic Quadrant for Business Intelligence and Analytics Platforms • For the first time, Microsoft is placed furthest in vision within the Leaders quadrant • Gartner positions Microsoft as a leader in BI and Analytics Platform Parts of Power BI • Desktop, service, and Mobile apps • designed to let people create, share, and consume business insights • in the way that serves them, or their role, most effectively. December 14, 2022 PA220 DB for Analytics 6 Power BI Concepts • Major building blocks: datasets, reports and dashboards • Organized in workspaces • Dataset • a collection of data that you import or connect to, • associated with workspaces, • a single dataset can be part of many workspaces. • Report • one or more pages of visualizations (visuals) such as line charts, maps, and treemaps • Dashboard • typically, a pinned visual of a report, but not necessarily • created in Power BI service December 14, 2022 PA220 DB for Analytics 7 Power BI Concepts • Why create dashboards? • to see, in one glance, all the information needed to make decisions, • to monitor the most-important information about your business, • to ensure all colleagues are on the same page, viewing and using the same information, • to monitor the health of a business or product or business unit or marketing campaign, etc., • to create a personalized view of a larger dashboard and show all the metrics that matter to them. December 14, 2022 PA220 DB for Analytics 8 Contents • Introducing Power BI Desktop • Creating queries • Configuring relationships • Enhancing the model • Securing the model • Designing reports • Publishing to Power BI • Describing additional capabilities • How to enable preview features • Additional resources December 14, 2022 PA220 DB for Analytics 10 Power BI Desktop December 14, 2022 PA220 DB for Analytics 11 Find out what is new in the latest release or interact on the highly interactive forums, which are a great way to ask questions Start a new import from a new source, or connect to a recently used source Open a recent file or other report Tutorial Videos Announcements Power BI Desktop December 14, 2022 PA220 DB for Analytics 12 1 – ribbon 2 – report view or canvas; displays also data and model 3 – pages tab (individual report pages) 4 – visuals pane 5 – fields pane Introducing Power BI Desktop • User experience: Data view • Work in Data View to inspect, explore, and understand data in the model • It is a different experience from how you can view tables, columns, and data in Query Editor • This is a view of the data after it has been loaded into the model Introducing Power BI Desktop • User experience: Relationships view • Work in Relationships View to view all tables, columns, measures and relationships in the model Introducing Power BI Desktop • Complex schemas: Development methodology • In contrast to the simple and quick “get then visualize” approach, more complex schemas may follow this methodology: 1. Create queries and use the Query Editor to filter, cleanse and reshape data 2. Configure/refine relationships to establish the foundations of a model 3. Enrich the model with calculation logic and formatting 4. Design interactive reports with a broad range of modern data visualizations 5. Publish solutions directly to the Power BI service Create queries Configure relationships Enhance the model Design reports Publish to Power BI Creating queries • Queries can be edited before loading into the model • Once created, each query: • Is referenced by a unique name • Defines query steps by using the language “M” • Most query step logic does not require writing formulas • Advanced logic can be written, and even exposed as functions • Is used to load data into a model table Create queries Configure relationships Enhance the model Design reports Publish to Power BI Creating queries • Source types: File • Source types: Database Creating queries • Source types: Azure Creating queries • Source types: Online Services •The list also includes: • Smartsheet • SQL Sentry (Beta) • Stripe (Beta) • SweetIQ (Beta) • Troux (Beta) • Twilio (Beta) • tyGraph (Beta) • WebTrends (Beta) • Zendesk (Beta) * As at 25 August, 2016 And the list is growing! Creating queries • Source types: Other • Note: The following are not supported for on-premises refresh • Hadoop File (HDFS) • Microsoft Exchange Creating queries • DirectQuery mode • Connections made to supported data sources can connect directly by using DirectQuery: • SQL Server • Azure SQL Database • Azure SQL Data Warehouse • SAP HANA • Oracle Database • Teradata Database • Amazon Redshift (Preview) • Benefits: • Allows building visualizations over very large datasets • Reports always use current data Creating queries • DirectyQuery mode: Continued • Limitations: • All tables must come from a single database • The Query Editor query cannot be overly complex • DAX formula are restricted to a subset of the DAX functions • Relationship filtering is limited to a single direction, rather than both directions • If enabled, a preview feature allows cross filtering in both directions • The data type of a column cannot be changed Creating queries • Defining query steps • The Query Editor can be used to filter, cleanse and reshape data into the desired result • Steps can easily be produced by applying column filters, and by using the commands available from the ribbon, or the query and column context menus • It is possible to select a step and preview the data at that step • Steps can be removed – but take care not to remove a step that downstream steps depend on • Step formulas can be viewed or edited in the formula bar Creating queries • Defining query steps: Commands • Numerous commands are available on the Query Editor ribbons and context menus to: • Manage columns • Reduce rows, including removing errors • Transform • Split • Add columns, by using formulas • As the editor experience is WYSIWYG, users can easily experiment and rollback changes Creating queries • Defining query steps: Combining queries • New queries can be created by: • Merging two queries (joining on a common column) • Appending two queries (union) • Merge join options include: Configuring relationships • Relationships enable filters on one table to appropriately propagate filter to other tables • Note: Relationships are not foreign key (integrity) constraints • They can be defined between any two tables, regardless of their data connection or source type • However, they must be based on single columns with matching data types • There is no support for self-referencing relationships Create queries Configure relationships Enhance the model Design reports Publish to Power BI Configuring relationships • Advanced options: Continued • Configure relationship advanced options to set: • Cardinality • Cross filter direction • The Both option enables many-to-many analysis • Active/inactive Tip: All options can be interpreted from symbols used in the relationship diagram Configuring relationships • Advanced options • It is possible to define multiple relationship paths between two tables • One relationship must be active, and will be used by default during model browsing • The remaining relationships are inactive, and can be used explicitly in model calculations Enhancing the model • Model tables can be extended with hierarchies, calculations, and may be hidden (i.e., not for reporting) • Model column properties can be set: • Data type • Format • Categorization • Spatial types (GPS), or web URLs • Sort order, based on another column from the same table • Default summarization (numeric, text and date columns) • Visibility Create queries Configure relationships Enhance the model Design reports Publish to Power BI Enhancing the model • Hierarchies • Hierarchies can be added to a table, enabling navigation between columns (in the one-to-many direction) Enhancing the model • Calculations • There are three different types of calculations, and each are defined by using DAX: • Calculated columns • Calculated tables • Measures • DAX = Data Analysis EXpressions (DAX), consisting of: • Excel functions (~80 functions) • Table functions • Aggregate functions • Relationship navigation functions • Context modification functions • Time Intelligence functions Enhancing the model • Calculations: Calculated columns • Define Calculated Columns to add new columns to tables • Column values for each row are materialized and stored in the model • If possible, to reduce the model size, avoid creating them • Column values are recalculated when: • The table is refreshed • Formula dependencies are refreshed Enhancing the model • Calculations: Calculated tables • Define Calculated Tables to add new tables to the model, based on existing data from other tables • Calculated tables are generally best for intermediate calculations of data stored in the model, rather than calculated on the fly • Calculated tables appear in Relationship View, and relationships can be defined with other tables • If possible, to reduce the model size, avoid creating them • Calculated tables are recalculated in the same circumstances as calculated columns Note: This is an advanced design concept Enhancing the model • Calculations: Measures • Define Measures to add aggregation logic to the model • Values are not materialized in the model • Formulas are evaluated at query time • Common aggregate functions: • Sum • Count, Distinct Count • Average • Minimum, Maximum Note: Summarizable columns are aggregated by reports—so defining measures is not necessary for simple aggregation requirements Designing reports • Reports can be designed based on the visible model interface • The design experience is almost the same as is available in the Power BI online service • Text boxes, images and shapes can be added • Text boxes can contain hyperlinks Create queries Configure relationships Enhance the model Design reports Publish to Power BI Designing reports • Custom visuals • Custom visuals can be imported to extend beyond the out-of-the-box visualizations • A gallery of visuals created by the Power BI community is available at https://app.powerbi.com/visuals • Browse through the visuals or submit one of your own for others to use • The list of available visuals is growing each month • Custom visuals will render in the Power BI service Designing reports • Custom visuals: Gallery (subset) * And the list is growing! Publishing to Power BI • The Power BI Desktop file can be uploaded to the Power BI service, or published directly • Publish directly to your workspace or a group • Note, if overwriting an existing dataset: • If there are two or more datasets with the same name, remove one, or rename the Power BI Desktop file • Renaming columns or measures can break existing reports or dashboard tiles Create queries Configure relationships Enhance the model Design reports Publish to Power BI Describing additional capabilities • Running R scripts • R scripts can be ran directly in Power BI Desktop, and resulting datasets imported into a Power BI Desktop data model • R must be installed on the local machine • Only data frames are imported • Columns typed as Complex and Vector are not imported • Can be refreshed with a gateway Describing additional capabilities • Generating R visuals • R visuals render from R script, accepting input fields • Benefits: • Leverage the voluminous and growing number of out-of-the-box plots available in R • Easily customize R visuals by developing the script • Combine advanced analytics in visuals • Interact with R visuals in Power BI Desktop (filter, and cross-filter are supported) Describing additional capabilities • Shape maps (Preview) • A Shape Map visual shows relative comparisons of regions on a map by applying different colors to different regions • In contrast to the Map visual, Shape Map cannot show precise geographical locations of data points on a map; instead, its main purpose is to show relative comparisons of regions on a map by coloring them differently Describing additional capabilities • Shape maps (Preview): Continued • Custom maps can be used provided they are in the TopoJSON format • Possibilities: Display geographical, seating arrangements, floor plans, etc. Additional Capabilities of Power BI • Exploring with Q&A • In the dashboard canvas, use Q&A to ask natural language questions • The dashboard must contain at least one tile to connect to a dataset, and • Only cloud-based cached datasets are supported • Q&A helps formulate an appropriate question • Q&A will format your question with suggestions, auto-complete, and even spelling corrections • Featured questions can be added to datasets • Only English questions are supported December 14, 2022 PA220 DB for Analytics 52 Additional Capabilities of Power BI • Exploring with Q&A • Begin by asking “Show…” • For example: • Show