
Tuesday, April 30, 2013

Cloudera’s CEO sizes up the SQL-on-Hadoop market

Cloudera’s Impala engine for interactive SQL queries on Hadoop data is now generally available, and CEO Mike Olson gives his lay of the competitive landscape.

There is no shortage of confidence in the Hadoop space, and market leader Cloudera bolstered its own on Tuesday with the general availability of its Impala SQL query engine for Hadoop. And if CEO Mike Olson’s comments are any indication, we’re in for a long ride of competitive jockeying and one-upmanship as Cloudera and its peers go all Microsoft or Google and create myriad new data-processing engines to turn their Hadoop distributions into bona fide platforms.
Launched as a private beta in May 2012 and made public in October, Impala is Cloudera’s attempt to address the growing demand for interactive SQL analytics on Hadoop data. It’s essentially a massively parallel database designed to share the same storage platform and metadata as Hadoop MapReduce, but with its own separate processing engine.

Impala actually uses the same “nearly ANSI” version of SQL as current standard bearer Hive, but that technology (created by Facebook in 2009 as a data warehouse layer for Hadoop) doesn’t run nearly fast enough to sate many users’ desire for interactive analytics. This is because Hive transforms SQL queries into MapReduce jobs, meaning each one runs as a batch job against the entire corpus of data in the Hadoop Distributed File System.

Sizing up the competition

Only Cloudera isn’t the first to have the idea, nor is it alone in trying to sell interactive SQL on Hadoop. The idea was first commercialized by Boston-based startup Hadapt in 2011, and is now being pushed by numerous startups and larger Hadoop players. Among them: Pivotal (formerly EMC) Greenplum, MapR (with Drill), Hortonworks (with Stinger), Drawn to Scale, Splice Machine, Jethro Data and Citus Data.

But Cloudera is arguably the biggest name pushing SQL on Hadoop, and CEO Mike Olson thinks Impala stands out for several reasons — not the least of which is that it exists as a product. “Nobody else is shipping production-grade SQL query support on Hadoop,” he told me during a recent call. “At least not in open source.” He seems content to let the startups do their thing, instead focusing his attention on Cloudera’s big three Hadoop-distribution competitors: Pivotal, MapR and Hortonworks. Greenplum and Pivotal SVP Scott Yara was full of confidence — and R&D budget — when the company announced the Pivotal HD distribution and HAWQ technology in February, but Olson claims the approach requires a siloed DBMS within HDFS and is a “rearguard defensive strategy” to protect the company’s sunk costs in its database technology.

As for Hortonworks, Olson questions the wisdom of its Stinger initiative to boost Hive’s speed, noting that “Hive never got good while it was running standalone on MapReduce.” Hortonworks also partners with vendors such as Teradata to let their platforms access Hadoop data in its native format, but those approaches still require sending data over the network. “It’s not the way you would build it if you woke up in the 2000s and were building this anew,” Olson said.

Olson acknowledged that the MapR-led Apache Drill project is cut from the same cloth as Impala (that is, being a Google Dremel clone designed specifically for Hadoop), but “the difference is we’re shipping code.” Being generally available and ready for production workloads means Cloudera can lock down users and market share before many even have a chance to experiment with Drill. He all but dismissed questions over the readiness of Impala, spurred by rumblings in the Hadoop space that Cloudera rushed it into public beta in order to get on the scoreboard against more fully baked offerings.
“I don’t feel we’re under the gun competitively to pull it out of beta because no one else has product in the market,” Olson said. “I have no problems … calling this GA quality.” He did, however, acknowledge that Impala is shipping with a “minimum viable feature set” that the company has plans to build on in the near future. Impala Senior Product Manager Justin Erickson noted a few issues of concern, including around the number of concurrent users Impala can support, but said they have been addressed during the beta period.

One piece of a larger platform

Really, though, the whole point of Impala and its competitors is to turn Hadoop from a tool for batch analytics and mass storage into a platform that can handle nearly all of companies’ data-processing needs. In that regard, it appears we’re just getting started. Cloudera, MapR, Pivotal Greenplum and Hortonworks are already pushing their own products and projects, and Olson said “it’s absolutely our intent” to enhance Cloudera’s platform with even more open-source products — perhaps even more database technologies a la HBase — that will let users do more stuff with more types of data. Over time, this strategy could result in Hadoop displacing the current breed of databases and data warehouses and becoming the single data store atop which users run whatever applications they so desire. For now, though, especially when it comes to Impala and the data warehouse incumbents, Olson is taking a measured approach. “The likelihood that we’re going to knock them off in the near term,” he said, “… it would be a tough fight to win.”


T-SQL command to check whether the SQL Server Agent service is started or down.

SET NOCOUNT ON

DECLARE @ISCOUNT varchar(100)
DECLARE @ISCOUNT2 integer

BEGIN

    -- Count SQL Server Agent sessions; zero means the service is not running.
    SELECT @ISCOUNT2 = count(1)
    FROM master.sys.sysprocesses
    WHERE program_name like N'%SQLAgent%'

    IF @ISCOUNT2 = 0
    BEGIN
        SET @ISCOUNT = convert(varchar(100),'FAILURE: SQL Server Agent service is NOT started!')
    END
    ELSE
    BEGIN
        SET @ISCOUNT = convert(varchar(100),'SQL Server Agent service is started.')
    END

    PRINT @ISCOUNT

END

GO
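
On newer builds the same check can be done without scraping sysprocesses. A minimal sketch using the sys.dm_server_services DMV (assumes SQL Server 2008 R2 SP1 or later and VIEW SERVER STATE permission; not available on older versions):

-- Reports the Agent service state straight from the engine.
SELECT servicename, status_desc, last_startup_time
FROM sys.dm_server_services
WHERE servicename LIKE N'SQL Server Agent%'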

Monday, April 29, 2013

Oracle Unveils SPARC Servers with the World’s Fastest Microprocessor

New SPARC Servers Redefine the Economics of Enterprise Computing: Deliver Extreme Performance and Value for Database and Enterprise Applications, Trump the Competition on Multiple Business-Critical Workloads

Redwood Shores, Calif. – March 26, 2013

News Facts

Oracle today announced a complete refresh of its midrange and high-end SPARC server lines with new SPARC T5 and M5 servers running Oracle Solaris.
Oracle’s new SPARC T5 midrange servers are based on the SPARC T5, the world’s fastest microprocessor.
Businesses today need the computing power to exploit Big Data to maximum advantage, to use analytics to discover growth and opportunities, and turn social streams into market intelligence. Traditional systems simply don’t have the horsepower to handle those demanding tasks.
Oracle’s new SPARC T5 servers have set 17 world records(1) and are the world’s best platforms for enterprise computing at any scale, delivering the best value for database and enterprise applications.
Oracle’s SPARC T5-8 is the fastest single server for Oracle Database(2).
Oracle’s SPARC T5-8 is also the fastest single server for Oracle Middleware with a 12x cost performance advantage over the IBM Power 780(3).
Oracle’s SPARC T5-8 server has a 2.5x price performance advantage over the IBM P780 3-node cluster, when including hardware and software costs(4).
Oracle's SPARC T5-8 server has a 7x price advantage over a similar IBM Power 780 configuration for database on a server-to-server basis(5).

Friday, April 19, 2013

Get Top Queries Taxing SQL Server CPUs


use tempdb
go
IF object_id('tempdb..##FindTopCPUQueries_set1') is not null DROP TABLE [dbo].[##FindTopCPUQueries_set1]
GO
declare @ServerTime datetime = getdate()
, @ConvertMiliSeconds bigint = 1000 -- total_worker_time is reported in microseconds; divide by 1000 for milliseconds
, @FilterMoreThanMiliSeconds bigint = 1 -- declared for filtering but not referenced below
, @FilterHours bigint = 2 -- only consider queries executed in the last N hours
, @execution_count bigint = 2 -- minimum number of executions
, @debugFlg bit = 0
if @debugFlg=1 select @ServerTime as ServerTime, @ConvertMiliSeconds as ConvertMiliSeconds
, @FilterMoreThanMiliSeconds as FilterMoreThanMiliSeconds, @FilterHours as FilterHours
, @execution_count as execution_count
select TOP 300
@@servername as servername,@ServerTime as runtime
,isnull(db_name(QueryText.dbid),'PreparedSQL') as DBName
 ,SUBSTRING(QueryText.text, (QueryStats.statement_start_offset/2)+1,
(isnull((
CASE QueryStats.statement_end_offset
 WHEN -1 THEN DATALENGTH(QueryText.text)
 WHEN 0 THEN DATALENGTH(QueryText.text)
 ELSE QueryStats.statement_end_offset
 END - QueryStats.statement_start_offset),0)/2)
 + 1) AS QueryExecuted
,total_worker_time AS total_worker_time
,QueryStats.execution_count as execution_count
,statement_start_offset,statement_end_offset
,(case when QueryText.dbid is null then OBJECT_NAME(QueryText.objectid) else OBJECT_NAME(QueryText.objectid, QueryText.dbid) end) as ObjectName
,query_hash
,plan_handle
,sql_handle
into ##FindTopCPUQueries_set1
from sys.dm_exec_query_stats as QueryStats
cross apply sys.dm_exec_sql_text(QueryStats.sql_handle) as QueryText
where QueryStats.query_hash IN
(
select QueryStatsBaseTable.query_hash
from sys.dm_exec_query_stats QueryStatsBaseTable
where last_execution_time > DATEADD(hh,-@FilterHours,GETDATE())
group by query_hash
having (sum(total_worker_time)/sum(execution_count))>@ConvertMiliSeconds and sum(execution_count)>@execution_count
)
ORDER BY total_worker_time/execution_count DESC;
if @debugFlg=1 select * from ##FindTopCPUQueries_set1 order by QueryExecuted
IF object_id('tempdb..##FindTopCPUQueries_set2') is not null DROP TABLE [dbo].[##FindTopCPUQueries_set2]
select
servername,runtime,max(DBName) as DBName,max(QueryExecuted) as QueryExecuted,(sum(total_worker_time)/sum(execution_count))/@ConvertMiliSeconds as AvgCPUTime
,sum(execution_count) as execution_count,query_hash, max(ObjectName) as ObjectName
into ##FindTopCPUQueries_set2
from ##FindTopCPUQueries_set1
group by query_hash,servername,runtime
order by AvgCPUTime desc
select * from ##FindTopCPUQueries_set2
--where QueryExecuted like 'select TOP 300%'
order by AvgCPUTime desc

Login Password Policy

Outputs a list of all the SQL logins and, for each one, indicates whether the password policy is enforced and, if it is, whether password expiration is also enforced.

SET NOCOUNT ON
SELECT
name,
CASE CAST(is_policy_checked AS TINYINT) + CAST(is_expiration_checked AS TINYINT)
WHEN 0 THEN 'Not Enforced'
WHEN 1 THEN 'Password - No Expiration'
WHEN 2 THEN 'Password With Expiration'
END AS PasswordEnforcement ,
LOGINPROPERTY(name,'BadPasswordCount') AS BadPasswordCount,
LOGINPROPERTY(name,'BadPasswordTime') AS BadPasswordTime,
LOGINPROPERTY(name,'DaysUntilExpiration') AS DaysUntilExpiration,default_database_name,
CASE WHEN LOGINPROPERTY(name,'IsExpired') = 0 THEN 'NO' ELSE 'YES' END AS IsExpired,
CASE WHEN LOGINPROPERTY(name,'IsLocked') = 0 THEN 'NO' ELSE 'YES' END AS IsLocked,
CASE WHEN LOGINPROPERTY(name,'IsMustChange') = 0 THEN 'NO' ELSE 'YES' END AS IsMustChange,
LOGINPROPERTY(name,'LockoutTime') AS LockoutTime,
LOGINPROPERTY(name,'PasswordLastSetTime') AS PasswordLastSetTime
FROM sys.sql_logins
ORDER BY name
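
If the report flags a login as 'Not Enforced', the checks can be switched on with ALTER LOGIN. A minimal sketch (the login name below is a placeholder for one of your own):

-- Hypothetical login name; enable both policy and expiration checks.
ALTER LOGIN [app_user] WITH CHECK_POLICY = ON, CHECK_EXPIRATION = ON;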

Get space used by queries in tempdb

USE tempdb
Go
SELECT t.text,
      SUM(internal_objects_alloc_page_count) AS task_internal_objects_alloc_page_count,
      SUM(internal_objects_dealloc_page_count) AS task_internal_objects_dealloc_page_count
    FROM sys.dm_db_task_space_usage  st
 JOIN sys.sysprocesses sp
  ON sp.spid = st.session_id
 CROSS APPLY sys.dm_exec_sql_text(sp.sql_handle) t
    GROUP BY t.text
 ORDER BY 2 DESC
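
Note that sys.sysprocesses is a deprecated compatibility view. A minimal sketch of the same report against the newer DMVs (covers currently executing requests only):

SELECT t.text,
       SUM(su.internal_objects_alloc_page_count)   AS task_internal_objects_alloc_page_count,
       SUM(su.internal_objects_dealloc_page_count) AS task_internal_objects_dealloc_page_count
FROM sys.dm_db_task_space_usage AS su
JOIN sys.dm_exec_requests AS r
  ON r.session_id = su.session_id
 AND r.request_id = su.request_id
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) AS t
GROUP BY t.text
ORDER BY 2 DESC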

How much space could you get shrinking all databases

This script is useful when you are short of space on a drive and need to know how much space you could get by shrinking your databases. Although shrinking is not a recommended practice, sometimes you need to free space on a drive. Just execute the script and it will tell you the free space you can obtain.

You can modify the line where [UsageType] = 'data' to specify log or data files.

You can also uncomment the line --where MB_Free < 5000 to restrict the result to drives with less than 'x' MB free.


begin
   
    set nocount on
    if exists (select 1
               from   tempdb..sysobjects
               where  [Id] = object_id('tempdb..#dbfileinfo'))
      begin
          drop table #dbfileinfo
      end
    if exists (select 1
               from   tempdb..sysobjects
               where  [Id] = object_id('tempdb..#logsizestats'))
      begin
          drop table #logsizestats
      end
    if exists (select 1
               from   tempdb..sysobjects
               where  [Id] = object_id('tempdb..#datafilestats'))
      begin
          drop table #datafilestats
      end
    if exists (select 1
               from   tempdb..sysobjects
               where  [Id] = object_id('tempdb..#fixeddrives'))
      begin
          drop table #fixeddrives
      end
    if exists (select 1
               from   tempdb..sysobjects
               where  [Id] = object_id('tempdb..#usados'))
      begin
          drop table #usados
      end
    create table #fixeddrives
      (
         DriveLetter varchar(10),
         MB_Free     dec(20, 2)
      )
    create table #datafilestats
      (
         DBName       varchar(255),
         DBId         int,
         FileId       tinyint,
         [FileGroup]  tinyint,
         TotalExtents dec(20, 2),
         UsedExtents  dec(20, 2),
         [Name]       varchar(255),
         [FileName]   varchar(400)
      )
    create table #logsizestats
      -- DBCC SQLPERF -- Provides statistics about how the transaction-log space was used in all databases. It can also be used to reset wait and latch statistics.
       (
         DBName      varchar(255) not null primary key clustered,
         -- Database Name -- Name of the database for the log statistics displayed.
          DBId        int,
         LogFile     real,
         -- Log Size (MB) -- Actual amount of space available for the log. This amount is smaller than the amount originally allocated for log space because the SQL Server 2005 Database Engine reserves a small amount of disk space for internal header information.
          LogFileUsed real,
         -- Log Space Used (%) -- Percentage of the log file currently occupied with transaction log information.
          Status      bit
      ) -- Status -- Status of the log file. Always 0.
    create table #dbfileinfo
      (
         [ServerName]        varchar(255),
         [DBName]            varchar(65),
         [LogicalFileName]   varchar(400),
         [UsageType]         varchar (30),
         [Size_MB]           dec(20, 2),
         [SpaceUsed_MB]      dec(20, 2),
         [MaxSize_MB]        dec(20, 2),
         [NextAllocation_MB] dec(20, 2),
         [GrowthType]        varchar(65),
         [FileId]            smallint,
         [GroupId]           smallint,
         [PhysicalFileName]  varchar(400),
         [DateChecked]       datetime
      )
    declare @SQLString varchar(3000)
    declare @MinId int
    declare @MaxId int
    declare @DBName varchar(255)
    declare @tblDBName table (
      RowId  int identity(1, 1),
      DBName varchar(255),
      DBId   int)
    insert into @tblDBName
                (DBName,
                 DBId)
    select [Name],
           DBId
    from   master..sysdatabases
    where  ( Status & 512 ) = 0 /*NOT IN (536,528,540,2584,1536,512,4194841)*/
    order  by [Name]
    insert into #logsizestats
                (DBName,
                 LogFile,
                 LogFileUsed,
                 Status)
    exec ('dbcc sqlperf(logspace) with no_infomsgs')
    update #logsizestats
    set    DBId = db_id(DBName)
    insert into #fixeddrives
    exec master..xp_fixeddrives
    select @MinId = min(RowId),
           @MaxId = max(RowId)
    from   @tblDBName
    while ( @MinId <= @MaxId )
      begin
          select @DBName = [DBName]
          from   @tblDBName
          where  RowId = @MinId
          select @SQLString = 'SELECT ServerName = @@SERVERNAME,' +
                              ' DBName = ''' +
                              @DBName +
                                     ''',' +
                                                  ' LogicalFileName = [name],' +
       ' UsageType = CASE WHEN (64&[status])=64 THEN ''Log'' ELSE ''Data'' END,'
                    +
                          ' Size_MB = [size]*8/1024.00,' +
                   ' SpaceUsed_MB = NULL,' +
-- 20081125 Arithmetic overflow error converting expression to data type int.
-- ' MaxSize_MB = CASE [maxsize] WHEN -1 THEN -1 WHEN 0 THEN [size]*8/1024.00 ELSE maxsize*8/1024.00 END,'+
 ' MaxSize_MB = CASE [maxsize] WHEN -1 THEN -1 WHEN 0 THEN [size]*8/1024.00 ELSE maxsize/1024.00*8 END,'
 +
-- 20081125 end
' NextExtent_MB = CASE WHEN (1048576&[status])=1048576 THEN ([growth]/100.00)*([size]*8/1024.00) WHEN [growth]=0 THEN 0 ELSE [growth]*8/1024.00 END,'
 +
' GrowthType = CASE WHEN (1048576&[status])=1048576 THEN ''%'' ELSE ''Pages'' END,'
 + ' FileId = [fileid],' + ' GroupId = [groupid],' +
             ' PhysicalFileName= [filename],' +
             ' CurTimeStamp = GETDATE()' +
-- 20081125 begin @DBName embedded spaces
-- 'FROM '+@DBName+'..sysfiles' 
'FROM [' + @DBName + ']..sysfiles'
    -- 20081125 end
    print @SQLString
    insert into #dbfileinfo
    exec (@SQLString)
    update #dbfileinfo
    -- 20081125 begin LogFileUsed is %
    -- SET SpaceUsed_MB = (SELECT LogFileUsed FROM #LogSizeStats WHERE DBName = @DBName)
     set    SpaceUsed_MB = Size_MB / 100.0 * (select LogFileUsed
                                             from   #logsizestats
                                             where  DBName = @DBName)
    -- 20081125 end
    where  UsageType = 'Log'
           and DBName = @DBName
    -- 20081125 begin @DBName embedded spaces
    -- SELECT @SQLString = 'USE ' + @DBName + ' DBCC SHOWFILESTATS WITH NO_INFOMSGS'
     select @SQLString = 'USE [' + @DBName +
                        '] DBCC SHOWFILESTATS WITH NO_INFOMSGS'
    -- 20081125 end
    insert #datafilestats
           (FileId,
            [FileGroup],
            TotalExtents,
            UsedExtents,
            [Name],
            [FileName])
    execute(@SQLString)
    update #dbfileinfo
    set    [SpaceUsed_MB] = S.[UsedExtents] * 64 / 1024.00
    from   #dbfileinfo as F
           inner join #datafilestats as S
             on F.[FileId] = S.[FileId]
                and F.[GroupId] = S.[FileGroup]
                and F.[DBName] = @DBName
    truncate table #datafilestats
    select @MinId = @MinId + 1
end
    select @@servername                            as servidor,
           substring(A.PhysicalFileName, 1, 1)     as unidad,
           sum ([Size_MB])                         as SqlTotalDB,
           sum([SpaceUsed_MB])                     as SqlTotalUsedSpaceDB,
           sum (( [Size_MB] ) - ( [SpaceUsed_MB] ))as SQLTotalFreeSpaceDB
    into   #usados
    from   #dbfileinfo as A
           left join #fixeddrives as B
             on substring(A.PhysicalFileName, 1, 1) = B.DriveLetter
    --ORDER BY CAST([NextAllocation_MB] AS decimal(6,2)) 
    --ORDER BY [FreeSpaceInDrive] 
    --where ([Size_MB] - [SpaceUsed_MB]) -  ([NextAllocation_MB]) < 0
    --where (B.MB_Free) + (([Size_MB] - [SpaceUsed_MB]) -  ([NextAllocation_MB])) < 2000
     --where (B.MB_Free) + (([Size_MB] - [SpaceUsed_MB]) -  ([NextAllocation_MB])) <= 0
     --where CONVERT(sysname,DatabasePropertyEx([DBName],'Recovery')) = 'FULL'
    --where [MaxSize_MB] <> 2097152.00 and [MaxSize_MB] <> -1.00
    --where left ([PhysicalFileName],1) = 'C'
    --where (CAST(([Size_MB] - [SpaceUsed_MB]) / [Size_MB] AS decimal(4,2))) > 0.5 and ([Size_MB]) > 5000 -- more than half the space free
     --where ([Size_MB]) > 100000 -- database size threshold
    --where [DBName] like '%content%'
    where [UsageType] = 'data'
    --where [NextAllocation_MB] < 5 and [SpaceUsed_MB] > 20 and [dbname] not in ('master','model','msdb','tempdb') -- large databases with little growth
     --where [NextAllocation_MB] > 500 -- databases with runaway growth
     --where [Size_MB] - [SpaceUsed_MB] < 50 and [UsageType] = 'log'
    group  by substring(A.PhysicalFileName, 1, 1)
   
    select
           sum(SQLTotalFreeSpaceDB) as TotalShrinkableSpaceMB
    
    from   #fixeddrives as f
           inner join #usados as z
             on z.unidad = f.DriveLetter
   
 --where MB_Free < 5000   

    --select * from #usados
    --ORDER BY ([Size_MB] - [SpaceUsed_MB]) -  ([NextAllocation_MB])
    --ORDER BY CAST([NextAllocation_MB] AS decimal(6,2)) 
    if exists (select 1
               from   tempdb..sysobjects
               where  [Id] = object_id('tempdb..#dbfileinfo'))
      begin
          drop table #dbfileinfo
      end
    if exists (select 1
               from   tempdb..sysobjects
               where  [Id] = object_id('tempdb..#logsizestats'))
      begin
          drop table #logsizestats
      end
    if exists (select 1
               from   tempdb..sysobjects
               where  [Id] = object_id('tempdb..#datafilestats'))
      begin
          drop table #datafilestats
      end
    if exists (select 1
               from   tempdb..sysobjects
               where  [Id] = object_id('tempdb..#fixeddrives'))
      begin
          drop table #fixeddrives
      end
    if exists (select 1
               from   tempdb..sysobjects
               where  [Id] = object_id('tempdb..#usados'))
      begin
          drop table #usados
      end
    -- 20081125 begin SET NOCOUNT OFF
    set nocount off
-- 20081125 end
end 

List Database Users and their Corresponding Roles

This script will generate a list of users and their corresponding database roles, application roles and type of login for a given database.

WITH Roles_CTE(Role_Name, Username)
AS
(
 SELECT
  User_Name(sm.[groupuid]) as [Role_Name],
  user_name(sm.[memberuid]) as [Username]
 FROM [sys].[sysmembers] sm
)
SELECT 
    Roles_CTE.Role_Name,
    [DatabaseUserName] = princ.[name],
    [UserType] = CASE princ.[type]
                    WHEN 'S' THEN 'SQL User'
                    WHEN 'U' THEN 'Windows User'
                    WHEN 'G' THEN 'Windows Group'
                    WHEN 'A' THEN 'Application Role'
                    WHEN 'R' THEN 'Database Role'
                    WHEN 'C' THEN 'User mapped to a certificate'
                    WHEN 'K' THEN 'User mapped to an asymmetric key'
                 END
FROM
    sys.database_principals princ
JOIN Roles_CTE on Username = princ.name
where princ.type in ('S', 'U', 'G', 'A', 'R', 'C', 'K')
ORDER BY princ.name
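
Note that sys.sysmembers is a SQL Server 2000 compatibility view; on SQL Server 2005 and later the same role membership can be read from sys.database_role_members. A minimal equivalent sketch:

SELECT USER_NAME(drm.role_principal_id) AS Role_Name,
       princ.name                       AS DatabaseUserName,
       princ.type_desc                  AS UserType
FROM sys.database_role_members AS drm
JOIN sys.database_principals AS princ
  ON princ.principal_id = drm.member_principal_id
ORDER BY princ.name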

Get Detach or Attach script for all user databases

If you want a detach script, run the script below with @DetachOrAttach = 1. Run it with 0 for an attach script.

The script will print a detach or attach script for all user databases. You can copy the output from the "Messages" tab in SSMS and execute it to detach/attach. If you want to detach or attach specific databases you can modify the cursor SQL accordingly.

USE [master];
GO
DECLARE @database NVARCHAR(200) ,
    @cmd NVARCHAR(1000) ,
    @detach_cmd NVARCHAR(4000) ,
    @attach_cmd NVARCHAR(4000) ,
    @file NVARCHAR(1000) ,
    @i INT ,
    @DetachOrAttach BIT;
SET @DetachOrAttach = 1;
-- 1 Detach 0 - Attach
-- 1 Generates Detach Script
-- 0 Generates Attach Script
DECLARE dbname_cur CURSOR STATIC LOCAL FORWARD_ONLY
FOR
    SELECT  RTRIM(LTRIM([name]))
    FROM    sys.databases
    WHERE   database_id > 4;
 -- No system databases
OPEN dbname_cur
FETCH NEXT FROM dbname_cur INTO @database
WHILE @@FETCH_STATUS = 0
    BEGIN
        SELECT  @i = 1;
        SET @attach_cmd = '-- ' + QUOTENAME(@database) + CHAR(10)
            + 'EXEC sp_attach_db @dbname = ''' + @database + '''' + CHAR(10);
      -- Change skip checks to false if you want to update statistics before you detach.
        SET @detach_cmd = '-- ' + QUOTENAME(@database) + CHAR(10)
            + 'EXEC sp_detach_db @dbname = ''' + @database
            + ''' , @skipchecks = ''true'';' + CHAR(10);
      -- Get a list of files for the database
        DECLARE dbfiles_cur CURSOR STATIC LOCAL FORWARD_ONLY
        FOR
            SELECT  physical_name
            FROM    sys.master_files
            WHERE   database_id = DB_ID(@database)
            ORDER BY [file_id];
        OPEN dbfiles_cur
        FETCH NEXT FROM dbfiles_cur INTO @file
        WHILE @@FETCH_STATUS = 0
            BEGIN
                SET @attach_cmd = @attach_cmd + '    ,@filename'
                    + CAST(@i AS NVARCHAR(10)) + ' = ''' + @file + ''''
                    + CHAR(10);
                SET @i = @i + 1;
                FETCH NEXT FROM dbfiles_cur INTO @file
            END
        CLOSE dbfiles_cur;
        DEALLOCATE dbfiles_cur;
        IF ( @DetachOrAttach = 0 )
            BEGIN
            -- Output attach script
                PRINT @attach_cmd;
            END
        ELSE -- Output detach script
            PRINT @detach_cmd;
        FETCH NEXT FROM dbname_cur INTO @database
    END
CLOSE dbname_cur;
DEALLOCATE dbname_cur;
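
Note that sp_attach_db is on the deprecation list, so the attach side could just as well be emitted as CREATE DATABASE ... FOR ATTACH. A minimal sketch of the equivalent statement (database name and file paths are placeholders):

-- Equivalent attach without the deprecated procedure:
CREATE DATABASE [MyDatabase]
ON (FILENAME = N'D:\Data\MyDatabase.mdf'),
   (FILENAME = N'E:\Logs\MyDatabase_log.ldf')
FOR ATTACH;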

Thursday, April 18, 2013

Full Text Search Service Architecture


SQL Server Full Text Search service is a specialized indexing and querying service for unstructured text stored in SQL Server databases. The full text search index can be created on any column with character-based text data. It allows words to be searched for in the text columns. While such searches can be performed with the SQL LIKE operator, using the SQL Server Full Text Search service can be more efficient. Full text search allows for inexact matching of the source string, indicated by a Rank value which can range from 0 to 1000 - a higher rank means a more accurate match. It also allows linguistic matching ("inflectional search"), i.e., linguistic variants of a word (such as a verb in a different tense) will also be a match for a given word (but with a lower rank than an exact match). Proximity searches are also supported, i.e., if the words searched for do not occur in the sequence they are specified in the query but are near each other, they are also considered a match. T-SQL exposes special operators that can be used to access the FTS capabilities.
The Full Text Search engine is divided into two processes - the Filter Daemon process (msftefd.exe) and the Search process (msftesql.exe). These processes interact with SQL Server. The Search process includes the indexer (which creates the full text indexes) and the full text query processor. The indexer scans through text columns in the database. It can also index binary columns, using iFilters to extract meaningful text from the binary blob (for example, when a Microsoft Word document is stored as an unstructured binary file in a database). The iFilters are hosted by the Filter Daemon process. Once the text is extracted, the Filter Daemon process breaks it up into a sequence of words and hands it over to the indexer. The indexer filters out noise words, i.e., words such as "a" and "and" which occur frequently and are not useful for search. With the remaining words, an inverted index is created, associating each word with the columns it was found in. SQL Server itself includes a Gatherer component that monitors changes to tables and invokes the indexer in case of updates.
When a full text query is received by the SQL Server query processor, it is handed over to the FTS query processor in the Search process. The FTS query processor breaks up the query into the constituent words, filters out the noise words, and uses an inbuilt thesaurus to find out the linguistic variants for each word. The words are then queried against the inverted index and a rank of their accurateness is computed. The results are returned to the client via the SQL Server process.
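
As a small illustration of those T-SQL operators, here is a hedged sketch against a hypothetical dbo.Documents table (the table, column, key index and catalog names are placeholders; a unique key index on the table is required):

-- One-time setup: a full text catalog and an index on the text column.
CREATE FULLTEXT CATALOG DocsCatalog;
CREATE FULLTEXT INDEX ON dbo.Documents (Body)
    KEY INDEX PK_Documents ON DocsCatalog;
GO
-- Boolean match:
SELECT DocId, Title FROM dbo.Documents
WHERE CONTAINS(Body, '"database" AND "mirroring"');
-- Inflectional (linguistic) match:
SELECT DocId, Title FROM dbo.Documents
WHERE CONTAINS(Body, 'FORMSOF(INFLECTIONAL, backup)');
-- Proximity match with ranked results (RANK runs from 0 to 1000):
SELECT d.DocId, d.Title, k.[RANK]
FROM CONTAINSTABLE(dbo.Documents, Body, 'full NEAR text') AS k
JOIN dbo.Documents AS d ON d.DocId = k.[KEY]
ORDER BY k.[RANK] DESC;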

Restore SQL Database from MDF file (without LDF file)

In the script below I create a database, create a table in it, drop the log file, and then re-create the database from the .mdf file alone.

-- created database with .mdf and .ldf file

CREATE DATABASE [TEST]
ON  PRIMARY
( NAME = N'TEST', FILENAME = N'C:\TEST_Data.mdf')
 LOG ON
( NAME = N'TEST_log', FILENAME = N'C:\TEST_log.ldf')
GO
-- inserting data into database
use TEST
go
CREATE TABLE customer
(    customer_id    int not null,
     customer_name  varchar(50) not null,
     address        varchar(50),
     city           varchar(50),
     state          varchar(25),
     zip_code       varchar(10)
)
-- inserting records
insert into customer values(1,'John','9290-9300 Transit Road','Amherst','NY','14051')
insert into customer values(2,'Sam','4030 Maple Ave.','Amherst','NY','14051')
insert into customer values(3,'Jason','4888 State Route 30','Amherst','NY','14051')
insert into customer values(4,'Joe','1651 Clark Street','Amherst','NY','14051')
go
-- Selecting Data and verifying Data is inserted
select * from TEST..customer
-- detaching the database so its log file can be deleted
USE [master]
GO
EXEC master.dbo.sp_detach_db @dbname = N'TEST'
GO
-- next, delete the log file manually or from the command prompt (xp_cmdshell must be enabled)
EXEC xp_cmdshell 'del C:\Test_log.ldf'
EXEC xp_cmdshell 'dir C:\Test_log.ldf'
-- script to attach the database
USE [master]
GO
CREATE DATABASE TEST ON
( FILENAME = N'C:\TEST_Data.mdf' )
FOR ATTACH
 GO
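
This works here because the database was cleanly detached, so SQL Server can rebuild a minimal log on attach. If the log file is missing after a less clean shutdown, FOR ATTACH_REBUILD_LOG can be tried instead; a hedged sketch (it still fails if the database was not shut down cleanly):

CREATE DATABASE TEST ON
( FILENAME = N'C:\TEST_Data.mdf' )
FOR ATTACH_REBUILD_LOG
GO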

Calculates Growth Info for All Databases

SET NOCOUNT ON
/*
Desc:  Calculates growth info for all databases on a server that are being backed up. Relies on the backup tables, and as a result will only
  contain as many days of history as the backup tables do (@iNoSamples). If a database is not being backed up the results will be NULL (for example, tempdb).
*/
CREATE TABLE ##tbl_DataSize
  (
  Size DECIMAL(20)
  )
CREATE TABLE #tbl_GrowthData
  (
   DatabaseName     VARCHAR(50)
  ,NoSampleDays     DECIMAL(20,3)
  ,DataSizeMB     DECIMAL(20,3)
  ,LogSizeMB     DECIMAL(20,3)
  ,BackupSizeMB     DECIMAL(20,3)
  ,TotalSpaceMB     DECIMAL(20,3)
  ,DataGrowth     DECIMAL(20,3)
  ,LogGrowth     DECIMAL(20,3)
  ,GrowthPercentage    DECIMAL(20,3)
  )
DECLARE
  @iNoSamples  INT
 ,@nMaxBackupSize DECIMAL
 ,@nMinBackupSize DECIMAL
 ,@nMaxLogSize  DECIMAL
 ,@nMinLogSize  DECIMAL
 ,@nMaxDataSize  DECIMAL
 ,@nMinDataSize  DECIMAL
 ,@vcDatabaseName VARCHAR(50)
 ,@dtMaxBackupTime DATETIME
 ,@dtMinBackupTime DATETIME
 ,@iMinBackupID  INT
 ,@iMaxBackupID  INT

DECLARE file_cursor CURSOR FOR
SELECT [name] FROM master.dbo.sysdatabases
ORDER BY [name]
OPEN file_cursor
   FETCH NEXT FROM file_cursor
   INTO @vcDatabaseName
WHILE @@FETCH_STATUS = 0
BEGIN 
SET @dtMaxBackupTime = (SELECT MAX(backup_finish_date)FROM msdb.dbo.backupset WHERE database_name = @vcDatabaseName AND [type] = 'D')
SET @dtMinBackupTime = (SELECT MIN(backup_finish_date)FROM msdb.dbo.backupset WHERE database_name = @vcDatabaseName AND [type] = 'D')
SET @iNoSamples =
 DATEDIFF
  (
    dd
   ,@dtMinBackupTime
   ,@dtMaxBackupTime
  )
SET @nMaxBackupSize = (SELECT backup_size FROM msdb.dbo.backupset WHERE database_name = @vcDatabaseName AND [type] = 'D' AND backup_finish_date = @dtMaxBackupTime)
SET @nMinBackupSize = (SELECT backup_size FROM msdb.dbo.backupset WHERE database_name = @vcDatabaseName AND [type] = 'D' AND backup_finish_date = @dtMinBackupTime)
SET @iMaxBackupID = (SELECT MAX(backup_set_id) FROM msdb.dbo.backupset WHERE database_name = @vcDatabaseName AND [type] = 'D' AND backup_finish_date = @dtMaxBackupTime)
SET @iMinBackupID = (SELECT MAX(backup_set_id) FROM msdb.dbo.backupset WHERE database_name = @vcDatabaseName AND [type] = 'D' AND backup_finish_date = @dtMinBackupTime)
SET @nMaxLogSize = (SELECT ((CAST((SUM(file_size)) AS DECIMAL(20,3))) /  1048576) FROM msdb.dbo.backupfile WHERE backup_set_id = @iMaxBackupID AND file_type = 'L')
SET @nMinLogSize = (SELECT ((CAST((SUM(file_size)) AS DECIMAL(20,3))) /  1048576) FROM msdb.dbo.backupfile WHERE backup_set_id = @iMinBackupID AND file_type = 'L')
SET @nMaxDataSize = (SELECT ((CAST((SUM(file_size)) AS DECIMAL(20,3))) /  1048576) FROM msdb.dbo.backupfile WHERE backup_set_id = @iMaxBackupID AND file_type = 'D')
SET @nMinDataSize = (SELECT ((CAST((SUM(file_size)) AS DECIMAL(20,3))) /  1048576) FROM msdb.dbo.backupfile WHERE backup_set_id = @iMinBackupID AND file_type = 'D')
EXEC ('
INSERT INTO ##tbl_DataSize
SELECT CAST((SUM(size)) as DECIMAL(20,3)) FROM ['+@vcDatabaseName+'].dbo.sysfiles'
)
INSERT INTO #tbl_GrowthData
SELECT
  @vcDatabaseName DatabaseName
 ,@iNoSamples NoSampleDays
 ,@nMaxDataSize
 ,@nMaxLogSize
 ,@nMaxBackupSize / 1048576
 ,((size * 8192) / 1048576) TotalSpaceUsed 
 ,@nMaxDataSize - @nMinDataSize
 ,@nMaxLogSize  - @nMinLogSize
 ,(((@nMaxDataSize + @nMaxLogSize) - (@nMinDataSize+ @nMinLogSize)) / (@nMinDataSize+ @nMinLogSize)) * 100.00
 --growth percentage is calculated based upon the original data size, before the growth. as a result it may look a little funny, but it is accurate. or at least I think so :)
FROM ##tbl_DataSize
 TRUNCATE TABLE ##tbl_DataSize
   FETCH NEXT FROM file_cursor
   INTO @vcDatabaseName
END
CLOSE file_cursor
DEALLOCATE file_cursor
SELECT
 *
FROM #tbl_GrowthData
DROP TABLE ##tbl_DataSize
DROP TABLE #tbl_GrowthData
SET NOCOUNT OFF
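
For a quick sanity check of the same trend without the cursor, a minimal sketch that lists full-backup sizes per database straight from msdb (history length depends on your msdb retention settings):

SELECT database_name,
       backup_finish_date,
       CAST(backup_size / 1048576.0 AS DECIMAL(20,2)) AS backup_size_mb
FROM msdb.dbo.backupset
WHERE [type] = 'D'
ORDER BY database_name, backup_finish_date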

What is the importance of the Resource DB in SQL Server?


What is the Resource database and what does it hold?

The Resource database (mssqlsystemresource) is a system database introduced in SQL Server 2005 and present in later versions.
The Resource database is a read-only, hidden system database that contains all the system objects that are included with SQL Server. SQL Server system objects, such as sys.objects, are physically persisted in the Resource database, but they logically appear in the sys schema of every database. The Resource database does not contain user data or user metadata.

Why is the Resource database important?

The Resource database appears to be a critical system database, as the SQL Server service is now dependent on it.

What are the Advantages of the Resource Database?

Prior to SQL Server 2005, whenever service packs were applied, all the system objects residing within the system and user databases got updated, which made it very difficult to roll back the changes.

 •The only way to roll back the changes was to uninstall the instance of SQL Server and reinstall SQL Server, followed by applying any Service Packs or hotfixes to revert to the previous version of the SQL Server instance, and then attach all user databases.

•From SQL Server 2005 onwards the changes are made to the Resource database, which in turn reflects the changes in all the system and user databases of the particular instance.

•If the DBA needs to apply a Service Pack to multiple instances, then the mssqlsystemresource.mdf and mssqlsystemresource.ldf files need to be copied by the DBA to the target instance.

•If the DBA needs to roll back the changes performed by the Service Packs, they need to replace the Resource database MDF and LDF files with the previously backed-up versions.

Here is some information about the resource database.

•This hidden database stores system stored procedures and functions

•This was created by Microsoft in order to facilitate faster upgrades. If the system objects were to be upgraded in any Service Pack, just replacing the resource database is enough.

•This database is not listed when sp_helpdb is executed.

•Since the Resource database is hidden, it cannot be backed up using the BACKUP DATABASE command. The only way to back up this database is to copy the .mdf & .ldf files. The same holds for restoring this database. If the Resource database is restored, all the service packs/hotfixes need to be reapplied.

•The database ID 32767 is reserved for the Resource database and cannot be changed. In SQL Server 2000, if any database is allocated the ID 32767, the upgrade to SQL Server 2005 will fail.

•If the files are renamed or moved from their respective locations then SQL Server will not start.

•The physical file names of the Resource database are mssqlsystemresource.mdf and mssqlsystemresource.ldf, which are specific to each SQL instance.

•The location of the mssqlsystemresource.mdf/.ldf files differs between SQL Server 2005 & SQL Server 2008.

◦In SQL Server 2005 these files are located in the C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\Data\ folder. These files must be in the same folder where the data files of the master database are stored.

◦In SQL Server 2008 the default location of these files is C:\Program Files\Microsoft SQL Server\MSSQL10.\MSSQL\Binn. These files need not be in the same directory where the master database files are located.

◦In a clustered environment, the Resource database exists in the \Data folder on a shared disk drive.

How to Back Up the Resource Database?

To take a backup of the Resource database we need to perform a file-based or disk-based backup of the mssqlsystemresource.mdf and mssqlsystemresource.ldf files by copying them. There is no full or online backup for this database.

SQL Server will allow the Resource database MDF & LDF files to be copied even while SQL Server is running.
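
A minimal sketch of such a file copy from T-SQL, assuming xp_cmdshell is enabled and using the SQL Server 2005 default path (both the source and the D:\Backup destination are placeholders to adjust for your instance; a plain OS-level copy works just as well):

EXEC xp_cmdshell 'copy "C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\Data\mssqlsystemresource.mdf" "D:\Backup\"'
EXEC xp_cmdshell 'copy "C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\Data\mssqlsystemresource.ldf" "D:\Backup\"'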

How to Restore the Resource Database?

Use the copy-paste method only: overwrite the existing files with the files from the backup location.

How to check the current version of the Resource DB and when it was last updated?

Use the following T-SQL code to get this information:

SELECT SERVERPROPERTY('ResourceVersion') Resource_DB_Version,
SERVERPROPERTY('ResourceLastUpdateDateTime') Resource_DB_LastUpdate_DateTime
GO


Notes:

1. Microsoft documentation on the Resource Database (http://msdn.microsoft.com/en-us/library/ms190940.aspx)

2. Do not put the Resource database files in compressed or encrypted NTFS file system folders, as doing so will affect performance and will also possibly prevent upgrades.

Wednesday, April 17, 2013

SQL Server 2008 Enterprise and Standard Feature Comparison

Detailed Feature Comparison | SQL Server 2008 Enterprise | SQL Server 2008 Standard

Scalability & Performance
Number of CPUs | OS maximum | 4
x32 hardware support | Available | Available
x64 hardware support | Available | Available
IA64 hardware support | Available | Not Available
Multi-instance support | 50 instances | 16 instances
Hypervisor support | Available | Not Available
Dynamic Address Windowing Extensions (AWE) memory management | Available | Not Available
Table and index partitioning | Available | Not Available
Parallel index operations | Available | Not Available
Parallel consistency checks (DBCC) | Available | Not Available
Database snapshots | Available | Not Available
Scalable shared databases | Available | Not Available
Indexed views | Available | Not Available
Enhanced read-ahead and scan | Available | Not Available

High Availability (Always On)
Log stream compression | Available | Available
Clustering | 16-node failover clustering | 2-node failover clustering
Database mirroring | Full | Single-thread
Online operations | Available | Not Available
Resource governor | Available | Not Available
Backup compression | Available | Not Available
Hot-add memory and CPU support | Available | Not Available
Database snapshots | Available | Not Available
Fast recovery | Available | Not Available
Online page and file restore | Available | Not Available
Parallel index operations | Available | Not Available
Updatable distributed partitioned views | Available | Not Available
Table and index partitioning | Available | Not Available

Enterprise Security
Data encryption | Transparent Data Encryption | Limited
Data encryption | Advanced third-party key management | Limited
Auditing | Security auditing | C2-compliant tracing

Data Warehousing
Partitioned table parallelism | Available | Not Available
Data compression | Available | Not Available
Star join query optimizations | Available | Not Available
Change data capture (CDC) | Available | Not Available

Business Intelligence

Analysis Services
Analysis Services backup | Available | Available
Dimension, Attribute Relationship, Aggregate & Cube Design | Available | Available
Personalization extensions | Available | Available
Analysis Services scalable shared databases | Available | Not Available
Account intelligence | Available | Not Available
Linked measures and dimensions | Available | Not Available
Perspectives | Available | Not Available
Semi-additive measures | Available | Not Available
Writeback dimensions | Available | Not Available
Proactive caching | Available | Not Available
Auto parallel partition processing | Available | Not Available
Partitioned cubes and distributed partitioned cubes | Available | Not Available

Data Mining
Comprehensive set of DM algorithms | Available | Available
Integrated Data Mining Tools (Wizards, Editors, Model Viewers, Query Builder) | Available | Available
Excel 2007 and Visio 2007 Add-in Support | Available | Available
Parallel model processing | Available | Not Available
Support for Plug-in Algorithms | Available | Not Available
Advanced configuration and tuning options for Data Mining algorithms | Available | Not Available
Support for Pipeline Data Mining and Text Mining with Integration Services | Available | Not Available

Integration Services
SQL Server Import and Export Wizard | Available | Available
SSIS run-time | Available | Available
Basic data profiling tools | Available | Available
SSIS Package Designer & Service | Available | Available
Data Mining query transformation | Available | Not Available
Data Mining model training destination adapter | Available | Not Available
Fuzzy grouping transformation | Available | Not Available
Fuzzy lookup transformation | Available | Not Available
Term extraction transformation | Available | Not Available
Term lookup transformation | Available | Not Available
Dimension processing destination adapter | Available | Not Available
Partition processing destination adapter | Available | Not Available

Reporting Services
Microsoft Office integration | Available | Available
Built-in forms authentication | Available | Available
Report server application embedding | Available | Available
Ad-hoc reporting (Report Builder) | Available | Available
Scale-out operational report configuration | Available | Not Available
Data-driven report subscriptions | Available | Not Available
Report scale-out deployment | Available | Not Available
Infinite click-through in ad-hoc reports | Available | Not Available

Enterprise Manageability
Policy-Based Management | Available | Available
Plan Freezing | Available | Available
Streamlined Installation | Available | Available
Performance Data Collection | Available | Available
Standard Performance Reports | Available | Available
SQL Server Management Studio Tools | Available | Available
Auto server group management | Available | Not Available
Mirrored backup media | Available | Not Available
Multi-server administration | Available | Not Available
VIA protocol support | Available | Not Available
Unlimited virtualization | Available | Not Available

Programmability
Common Language Runtime (CLR) Integration | Available | Available
Native XML Support | Available | Available
XML Indexing | Available | Available
Spatial Support | Available | Available
Filestream Support | Available | Available
SQL Server 2008 Enterprise

SQL Server 2008 Enterprise is a comprehensive data platform for running mission critical online transaction processing (OLTP), data warehousing (DW) and business intelligence (BI) applications. SQL Server 2008 Enterprise provides enterprise class scalability, high availability and security to meet the high demands of these applications.
Enterprise is ideally suited for the following usage scenarios:
- Mission critical deployments requiring high availability and uptime
- Existing large scale OLTP deployments
- OLTP deployments that expect to grow rapidly in the future
- Large scale reporting and analysis deployments
- Data Warehousing
- Server Consolidation