
Wednesday 24 February 2021

Generating Usage Statistics from a SSAS Tabular Cube

Once you have users accessing your cube, it’s almost inevitable that at some point someone will ask you to generate usage statistics from it, and there are a number of methods to achieve this. In this quick blog post, I’ll detail these methods and my experiences with each, and then use the collected data to create a Power BI report.

 

Native Tabular Properties

The first method is native to the tabular cube, via its server properties. It also has the added bonus (read: impact) that it will optimise future aggregations based on usage – in fact, that’s its main purpose.

This is done by setting CreateQueryLogTable to true, setting the QueryLogConnectionString (to point to the database where the usage table should be hosted), setting the QueryLogSampling rate (10 means every 10th query will be logged), and finally setting the name of the QueryLog table.

SQL Server Analysis Services Query Log Related Properties
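
If I recall correctly, these settings appear in SSMS under the Log \ QueryLog group of the server's advanced properties; the values below are purely illustrative:

Log \ QueryLog \ CreateQueryLogTable      = true
Log \ QueryLog \ QueryLogConnectionString = Data Source=MyServer;Initial Catalog=UsageStats   (illustrative)
Log \ QueryLog \ QueryLogSampling         = 10    (log every 10th query)
Log \ QueryLog \ QueryLogTableName        = OlapQueryLog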

The advantages of this method are that it is very easy to set up with limited knowledge required, and it could potentially improve performance in an environment where users submit repetitive queries. Unfortunately there are also a number of disadvantages, which led me to look for other methods. Firstly, it creates a degree of overhead on the cube if it samples too often; we actually had visible performance-related complaints once we turned it on, caused either by the sampling or by the change to the “optimised” aggregations. Depending on the sampling rate, you may also find that users who rarely use the cube are not picked up in the stats. As well as this, any change to the cube structure will cause the logging table to be reset. The table is also limited in what it actually logs (as you can see below); useful if you just want the user and timestamp info, but not much else, and there is no real ability to configure it.

Query the OlapQueryLog table for Cube activity
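
As a rough sketch of what you can get out of it (assuming the default OlapQueryLog table name and its standard columns), a simple per-user summary looks like this:

-- assumes the query log table was created with its default name, dbo.OlapQueryLog
SELECT MSOLAP_User,
       COUNT(*)       AS QueriesLogged,
       MIN(StartTime) AS FirstSeen,
       MAX(StartTime) AS LastSeen
FROM dbo.OlapQueryLog
GROUP BY MSOLAP_User
ORDER BY QueriesLogged DESC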

 

AS Trace

To that end, I looked for other tools to do the same task better, and I found AS Trace.

Originally built for SQL Server 2012, it works fine on 2014 and provides the ability to run a trace against cube activity (and log it to a table) exactly like SQL Profiler, but without the overhead of the GUI, which consumes unnecessary memory and processor time. It also runs as a Windows service, allowing it to restart automatically when the server reboots; when that happens, the tool copies the existing data to a History table and truncates the logging table. Exactly what I was after.

The tool collects information based on a preconfigured Analysis Services Profiler template, which can be optimised depending on which events you are interested in. I initially ran it using most events selected, and with a limited user set it was generating in the region of 25,000 rows a day. This was clearly not maintainable for a long period of time. I then used the following blog post to understand what each event of the profiler was giving me and then just created a lightweight trace definition file to give me what I wanted. I limited it to Query Begin, Query End (for DAX/MDX statements) and Audit Logon/Logout (for session data).

The setup is very straightforward: run install.bat under an account with elevated privileges and check that it installs the service correctly. Next, set your SSAS service account as the Log On account for the service, and make sure the account has the “Log on as a service” right, has DDL and DML access to the database you are writing to (i.e. is able to create and write to tables), and lastly has admin rights on the SSAS instance you intend to trace.

Next, configure the ASTrace.exe.config file with the parameters you want the tool to use. This includes the location of the cube (the tool can handle multiple cubes), the location of the trace definition file, the DB instance and table you want to log to, and lastly whether you want to preserve history on restart. The only thing I couldn’t do here is set the schema of the logging table, which defaults to dbo.

All that’s left is to start the service, and check the log file to see if it has created any errors on start-up. If not, the table should be created correctly and awaiting input.

While researching I also saw another method using Extended Events (XEvents), but did not implement it once AS Trace provided me with the information I needed.

 

View / Power BI Report

I initially used the data to run a limited set of queries to extract total users, and total queries for a given time period. This was useful to a degree but from the data collected I realised I could be doing so much more.

This led me to do some analysis of the type of metrics being logged, and allowed me to create a view on top of the tables containing what I thought might be useful on a report. I removed all the redundant columns it was tracking, and created friendly names for the EventSubclass and other columns. I used the PATINDEX function to check the query statement for the existence of some important values; while not an exact science, it gives me a good picture of the split between certain user groups and the KPIs being run. I’ve included the view definition below.

I ended up limiting the data to EventClass 10, as this seemed to capture all the necessary data. The only downside I have seen so far is that users querying through the Power BI web service are anonymised under the service account name. I’m currently looking at configuration options on the Power BI side to resolve this, by passing the effective username through, as long as it can be matched at the other end.

-- Friendly view over the AS Trace output (EventClass 10 = Query End events only)
SELECT 
    RowNumber   AS ID, 
    SPID        AS SessionID,
    CurrentTime AS DateQueried, 
    NTUserName  AS Username,
    CASE EventSubClass 
        WHEN 0 THEN 'MDX Query (Excel)' 
        WHEN 3 THEN 'DAX Query (Power BI)' 
        WHEN 1 THEN 'METADATA Query'
    END AS QueryType, 
    CASE Success WHEN 1 THEN 'Successful Query' ELSE 'Query Error' END AS SuccessfulQuery,
    -- CPUTime and Duration are logged in milliseconds; convert to seconds
    CONVERT(DECIMAL(10,2), CONVERT(DECIMAL(18,3), CPUTime)  / 1000) AS CPUTimeSec, 
    CONVERT(DECIMAL(10,2), CONVERT(DECIMAL(18,3), Duration) / 1000) AS DurationSec, 
    TextData AS Query, 
    -- crude keyword matching to split queries by market / KPI
    CASE PATINDEX('%Mexico%',   TextData) WHEN 0 THEN 0 ELSE 1 END AS MexicoMarket,
    CASE PATINDEX('%Colombia%', TextData) WHEN 0 THEN 0 ELSE 1 END AS ColombiaMarket,
    CASE PATINDEX('%CS4%',      TextData) WHEN 0 THEN 0 ELSE 1 END AS CS4,
    ServerName
FROM [dbo].[ASTraceTable]
WHERE EventClass = 10
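
As a rough usage example, assuming the SELECT above has been wrapped in a view (the name dbo.vCubeUsage below is hypothetical), the headline numbers of users and queries per day can then be pulled like this:

-- dbo.vCubeUsage is a hypothetical name for a view over the SELECT above
SELECT CAST(DateQueried AS DATE) AS QueryDate,
       COUNT(DISTINCT Username)  AS TotalUsers,
       COUNT(*)                  AS TotalQueries,
       AVG(DurationSec)          AS AvgDurationSec
FROM dbo.vCubeUsage
GROUP BY CAST(DateQueried AS DATE)
ORDER BY QueryDate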

Once I had the view, creating the report was relatively straight forward, and can be seen below.

I included metrics for number of queries by user (blurred out) which also doubled as a filter, the % split of queries for things such as Excel/Power BI, a measure of queries by timeframe, a logarithmic scaled display for queries by query duration, and lastly a split of queries by KPI. I intend to tweak these once I receive more data from the trace, but was relatively happy with the information that they were providing.


Please let me know if you have any comments.

https://adatis.co.uk/generating-usage-statistics-from-a-ssas-tabular-cube/

Thursday 27 March 2014

SQL SERVER – Clean Cache and Clean Buffer

There are situations where SQL Server is not in a position to allocate memory for new resources. At times, we have to clear the caches to free up memory for them.

DBCC FREEPROCCACHE

Use DBCC FREEPROCCACHE to clear the procedure cache. Freeing the procedure cache would cause, for example, an ad-hoc SQL statement to be recompiled rather than reused from the cache. If observing through SQL Profiler, one can watch the Cache Remove events occur as DBCC FREEPROCCACHE goes to work. DBCC FREEPROCCACHE will invalidate all stored procedure plans that the optimizer has cached in memory and force SQL Server to compile new plans the next time those procedures are run.
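
As a minimal sketch (the commented-out handle at the end is a placeholder, not a real value), you can clear the whole plan cache, or locate an individual plan and free just that one:

-- clear the entire plan cache
DBCC FREEPROCCACHE

-- or find the handle of a specific cached plan first...
SELECT cp.plan_handle, cp.usecounts, st.text
FROM sys.dm_exec_cached_plans cp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) st
WHERE st.text LIKE '%your query text here%'

-- ...and free only that plan (placeholder handle shown)
-- DBCC FREEPROCCACHE (0x060006001ECA270EC0215D05000000000000000000000000)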

DBCC DROPCLEANBUFFERS

Use DBCC DROPCLEANBUFFERS to test queries with a cold buffer cache without shutting down and restarting the server. DBCC DROPCLEANBUFFERS serves to empty the data cache. Any data loaded into the buffer cache due to the prior execution of a query is removed.
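
A typical cold-cache test pattern (just a sketch, and only something to run on a test server) issues a CHECKPOINT first, so that dirty pages are written to disk and the buffer cache can be emptied completely:

-- flush dirty pages to disk first
CHECKPOINT;
-- then empty the clean buffer (data) cache
DBCC DROPCLEANBUFFERS;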

DBCC FREESYSTEMCACHE ('ALL')

Releases all unused cache entries from all caches. The SQL Server Database Engine proactively cleans up unused cache entries in the background to make memory available for current entries. However, you can use this command to manually remove unused entries from all caches or from a specified Resource Governor pool cache.

DBCC FREESESSIONCACHE

Flushes the distributed query connection cache used by distributed queries against an instance of Microsoft SQL Server.

To find the cached size per database

SELECT COUNT(*) * 8 / 1024 AS [Cached Size (MB)],
       CASE database_id
            WHEN 32767 THEN 'ResourceDb'
            ELSE DB_NAME(database_id)
       END AS [Database]
FROM sys.dm_os_buffer_descriptors
GROUP BY DB_NAME(database_id), database_id
ORDER BY [Cached Size (MB)] DESC



To find the cached size of the plans

SELECT objtype AS [Cached Object Type],
       COUNT(*) AS [Number of Plans],
       SUM(CAST(size_in_bytes AS BIGINT)) / 1024 / 1024 AS [Plan Cache Size (MB)],
       AVG(usecounts) AS [Avg Use Count]
FROM sys.dm_exec_cached_plans
GROUP BY objtype

http://manage-sqlserver.blogspot.in/2012/08/clean-up-sql-server-memory.html

http://technet.microsoft.com/en-us/library/ms178529.aspx

http://technet.microsoft.com/en-us/library/ms187781.aspx

Tuesday 25 March 2014

SQL Server Fragmentation - Detect and Eliminate Fragmentation

What is Fragmentation? How to detect fragmentation and how to eliminate it?

Storing data non-contiguously on disk is known as fragmentation. Before learning to eliminate fragmentation, you should have a clear understanding of the types of fragmentation. We can classify fragmentation into two types:

  • Internal Fragmentation: When records are stored non-contiguously inside the page, then it is called internal fragmentation. In other words, internal fragmentation is said to occur if there is unused space between records in a page. This fragmentation occurs through the process of data modifications (INSERT, UPDATE, and DELETE statements) that are made against the table and therefore, to the indexes defined on the table. As these modifications are not equally distributed among the rows of the table and indexes, the fullness of each page can vary over time. This unused space causes poor cache utilization and more I/O, which ultimately leads to poor query performance.
  • External Fragmentation: This occurs when the physical storage of pages and extents is not contiguous on disk. When the extents of a table are not stored contiguously, switching from one extent to another requires more disk movement; this is called extent fragmentation.

Index pages also maintain a logical order of pages. Every index page is linked to the previous and next page in the logical order of the column data. However, because of page splits, pages can end up out of order. An out-of-order page is a page for which the next physical page allocated to the index is not the page pointed to by the next-page pointer in the current leaf page. This is called logical fragmentation.

For ideal, non-fragmented pages, the statistics for a table scan are as follows:

  • Page read requests: 2
  • Extent switches: 0
  • Disk space used by table: 16 KB
  • avg_fragmentation_in_percent: 0
  • avg_page_space_used_in_percent: 100

For fragmented pages, in contrast, the statistics for the same table scan are as follows:

  • Page read requests: 6
  • Extent switches: 5
  • Disk space used by table: 48 KB
  • avg_fragmentation_in_percent > 80
  • avg_page_space_used_in_percent: 33

How to detect Fragmentation:

We can measure both types of fragmentation using the DMV sys.dm_db_index_physical_stats, for example:

SELECT OBJECT_NAME(object_id) AS TableName, index_id, index_type_desc, index_level,
       avg_fragmentation_in_percent, avg_page_space_used_in_percent, page_count
FROM sys.dm_db_index_physical_stats
     (DB_ID(N'AdventureWorksLT'), NULL, NULL, NULL, 'SAMPLED')
ORDER BY avg_fragmentation_in_percent DESC





Along with other information, there are two important columns for detecting fragmentation, which are as follows:


  • avg_fragmentation_in_percent: This percentage value represents external fragmentation. For a clustered table and the leaf level of index pages, this is logical fragmentation; for a heap, it is extent fragmentation. The lower this value, the better. If it is higher than 10%, some corrective action should be taken.
  • avg_page_space_used_in_percent: This is the average percentage of page space used and represents internal fragmentation. The higher the value, the better. If it is lower than 75%, some corrective action should be taken.

Reducing fragmentation:


  • Reducing Fragmentation in a Heap: To reduce the fragmentation of a heap, create a clustered index on the table. Creating the clustered index rearranges the records in order and places the pages contiguously on disk.
  • Reducing Fragmentation in an Index: There are three choices for reducing fragmentation, and we can choose one according to the percentage of fragmentation (a maintenance sketch follows the list):

    • If avg_fragmentation_in_percent is > 5% and < 30%, use ALTER INDEX REORGANIZE: this statement is the replacement for DBCC INDEXDEFRAG and reorders the leaf-level pages of the index into logical order. As this is an online operation, the index remains available while the statement is running.
    • If avg_fragmentation_in_percent is > 30%, use ALTER INDEX REBUILD: this is the replacement for DBCC DBREINDEX and rebuilds the index online or offline. In such cases, we can also use the drop and re-create index method.
    • Drop and re-create the clustered index (update: please note this option is strongly NOT recommended): re-creating a clustered index redistributes the data and results in full data pages. The level of fullness can be configured by using the FILLFACTOR option of CREATE INDEX.
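
As a rough sketch of that decision logic (thresholds hard-coded to the 5% / 30% guidance above; review the generated commands before running anything), the following produces a REORGANIZE or REBUILD statement per fragmented index in the current database:

SELECT 'ALTER INDEX [' + i.name + '] ON ['
       + OBJECT_SCHEMA_NAME(ps.object_id) + '].[' + OBJECT_NAME(ps.object_id) + '] '
       + CASE WHEN ps.avg_fragmentation_in_percent > 30 THEN 'REBUILD' ELSE 'REORGANIZE' END
       AS MaintenanceCommand
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'SAMPLED') ps
JOIN sys.indexes i
    ON i.object_id = ps.object_id AND i.index_id = ps.index_id
WHERE ps.avg_fragmentation_in_percent > 5
  AND ps.index_id > 0  -- skip heaps: they have no index to reorganize or rebuild
ORDER BY ps.avg_fragmentation_in_percent DESC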

 


http://blog.sqlauthority.com/2010/01/12/sql-server-fragmentation-detect-fragmentation-and-eliminate-fragmentation/

Monday 30 September 2013

Immediate Deadlock notifications in SQL Server


Deadlocks can be a pain to debug since they're so rare and unpredictable. The problem lies in reproducing them in your dev environment. That's why it's crucial to have as much information about them from the production environment as possible.
There are two ways to monitor deadlocks, which I'll talk about in future posts: SQL Server tracing and error log checking. Unfortunately both suffer from the same thing: you don't know immediately when a deadlock occurs. Getting this information as soon as possible is sometimes crucial in production environments. Sure, you can always turn trace flag 1222 on, but this still doesn't solve the immediate-notification problem.
One problem for some might be that this method is only truly useful if you limit data access to stored procedures. <joke> So all you ORM lovers can stop reading, since this doesn't apply to you anymore! </joke>
The other problem is that it requires a rewrite of the problematic stored procedures to support it. However, since SQL Server 2005 came out, my opinion is that every stored procedure should have a try ... catch block implemented. There's no visible performance hit from this and the benefits can be huge. One of those benefits is instant deadlock notification.

Needed "infrastructure"
So let's see how it's done. This must, of course, be implemented in the database you wish to monitor.
First we need a view that will get lock info about the deadlock that just happened. You can read why this type of query gives us the info we need in my previous post.
CREATE VIEW vLocks
AS
SELECT  L.request_session_id AS SPID,
        DB_NAME(L.resource_database_id) AS DatabaseName,
        O.Name AS LockedObjectName,
        P.object_id AS LockedObjectId,
        L.resource_type AS LockedResource,
        L.request_mode AS LockType,
        ST.text AS SqlStatementText,       
        ES.login_name AS LoginName,
        ES.host_name AS HostName,
        TST.is_user_transaction AS IsUserTransaction,
        AT.name AS TransactionName   
FROM    sys.dm_tran_locks L
        LEFT JOIN sys.partitions P ON P.hobt_id = L.resource_associated_entity_id
        LEFT JOIN sys.objects O ON O.object_id = P.object_id
        LEFT JOIN sys.dm_exec_sessions ES ON ES.session_id = L.request_session_id
        LEFT JOIN sys.dm_tran_session_transactions TST ON ES.session_id = TST.session_id
        LEFT JOIN sys.dm_tran_active_transactions AT ON TST.transaction_id = AT.transaction_id
        LEFT JOIN sys.dm_exec_requests ER ON AT.transaction_id = ER.transaction_id
        CROSS APPLY sys.dm_exec_sql_text(ER.sql_handle) AS ST
WHERE   resource_database_id = db_id()
GO
Next we have to create our stored procedure template:
CREATE PROC <ProcedureName>
AS
  BEGIN TRAN
    BEGIN TRY

      <SPROC TEXT GOES HERE>

    COMMIT
  END TRY
  BEGIN CATCH
    -- check transaction state
    IF XACT_STATE() = -1
    BEGIN
      DECLARE @message xml
      -- get our deadlock info FROM the VIEW
      SET @message = '<TransactionLocks>' + (SELECT * FROM vLocks ORDER BY SPID FOR XML PATH('TransactionLock')) + '</TransactionLocks>'

      -- issue ROLLBACK so we don't ROLLBACK mail sending
      ROLLBACK

      -- get our error message and number
      DECLARE @ErrorNumber INT, @ErrorMessage NVARCHAR(2048)
      SELECT @ErrorNumber = ERROR_NUMBER(), @ErrorMessage = ERROR_MESSAGE()

      -- if it's deadlock error send mail notification
      IF @ErrorNumber = 1205
      BEGIN
        DECLARE @MailBody NVARCHAR(max)
        -- create our mail body in XML format. you can change this to your liking.
        SELECT  @MailBody = '<DeadlockNotification>'
                            + 
                            (SELECT 'Error number: ' + isnull(CAST(@ErrorNumber AS VARCHAR(5)), '-1') + CHAR(10) +
                                    'Error message: ' + isnull(@ErrorMessage, ' NO error message') + CHAR(10)
                             FOR XML PATH('ErrorMessage'))
                            +
                            CAST(ISNULL(@message, '') AS NVARCHAR(MAX))
                            +
                            '</DeadlockNotification>'
        -- for testing purposes
        -- SELECT CAST(@MailBody AS XML)

        -- send an email with the defined email profile.
        -- since this is async it doesn't halt execution
        EXEC msdb.dbo.sp_send_dbmail
                       @profile_name = 'your mail profile',
                       @recipients = 'dba@yourCompany.com',
                       @subject = 'Deadlock occurred notification',
                       @body = @MailBody;
      END
    END
  END CATCH
GO
The main part of this stored procedure is of course the CATCH block. The first thing we do there is check the XACT_STATE() value. This is a scalar function that reports the user transaction state; -1 means the transaction is uncommittable and has to be rolled back, which is the state of the victim transaction in the internal deadlock-resolution process.

Next we read from our vLocks view to get the full info (SPIDs, the SQL statement text of both sessions, etc.) about both SPIDs involved in the deadlock. This is possible because our deadlock-victim transaction hasn't been rolled back yet, so the locks are still present. We save this data into an XML message and then roll back our transaction to release the locks.

Finally, using the error message and its corresponding number, we check whether the error is 1205 (deadlock), and if it is we send our message in an email. How to configure Database Mail can be seen here.
Both the view and the stored procedure template can, and probably should, be customized to suit your needs.

Testing the theory
Let's try it out and see how it works with a textbook deadlock example that you can find in every book or tutorial.
-- create our deadlock table with 2 simple rows
CREATE TABLE DeadlockTest ( id INT)
INSERT INTO DeadlockTest
SELECT 1 UNION ALL
SELECT 2
GO
Next create two stored procedures (spProc1 and spProc2) with our template:
For spProc1 replace <SPROC TEXT GOES HERE> in the template with:
UPDATE DeadlockTest
SET id = 12
WHERE id = 2
  
-- wait 5 secs TO SET up deadlock condition IN other window
WAITFOR DELAY '00:00:05'

UPDATE DeadlockTest
SET id = 11
WHERE id = 1

For spProc2 replace <SPROC TEXT GOES HERE> in the template with:
UPDATE DeadlockTest
SET id = 11
WHERE id = 1

-- wait 5 secs TO SET up deadlock condition IN other window
WAITFOR DELAY '00:00:05'

UPDATE DeadlockTest
SET id = 12
WHERE id = 2

Next open 2 query windows in SSMS:
In window 1 put this script:
exec spProc1
In window 2 put this script:
exec spProc2

Run the script in the first window, and after a second or two run the script in the second window. A deadlock will happen, and a few moments after the victim transaction fails you should get the notification mail (the mail profile has to be properly configured, of course).
The resulting email should contain an XML document with full info about the deadlock. You can view it in SSMS by commenting out the msdb.dbo.sp_send_dbmail execution and uncommenting the SELECT CAST(@MailBody AS XML) line.
