Thứ Năm, 17 tháng 7, 2014

To BLOB or Not To BLOB: Large Object Storage in a Database or a Filesystem - Microsoft Research

There's a really good paper by Microsoft Research called To Blob or Not To Blob.

Their conclusion after a large number of performance tests and analysis is this:

  • if your pictures or document are typically below 256K in size, storing them in a database VARBINARY column is more efficient
  • if your pictures or document are typically over 1 MB in size, storing them in the filesystem is more efficient (and with SQL Server 2008's FILESTREAM attribute, they're still under transactional control and part of the database)
  • in between those two, it's a bit of a toss-up depending on your use

If you decide to put your pictures into a SQL Server table, I would strongly recommend using a separate table for storing those pictures - do not store the employee foto in the employee table - keep them in a separate table. That way, the Employee table can stay lean and mean and very efficient, assuming you don't always need to select the employee foto, too, as part of your queries.

For filegroups, check out Files and Filegroup Architecture for an intro. Basically, you would either create your database with a separate filegroup for large data structures right from the beginning, or add an additional filegroup later. Let's call it "LARGE_DATA".

Now, whenever you have a new table to create which needs to store VARCHAR(MAX) or VARBINARY(MAX) columns, you can specify this file group for the large data:

CREATE TABLE dbo.YourTable

     (....... define the fields here ......)

     ON Data                   -- the basic "Data" filegroup for the regular data

     TEXTIMAGE_ON LARGE_DATA   -- the filegroup for large chunks of data

Check out the MSDN intro on filegroups, and play around with it!

 

http://research.microsoft.com/apps/pubs/default.aspx?id=64525

To BLOB or Not To BLOB: Large Object Storage in a Database or a Filesystem

Russell Sears, Catharine Van Ingen, and Jim Gray
April 2006

Abstract

Application designers often face the question of whether to store large objects in a filesystem or in a database. Often this decision is made for application design simplicity. Sometimes, performance measurements are also used. This paper looks at the question of fragmentation – one of the operational issues that can affect the performance and/or manageability of the system as deployed long term. As expected from the common wisdom, objects smaller than 256K are best stored in a database while objects larger than 1M are best stored in the filesystem. Between 256K and 1M, the read:write ratio and rate of object overwrite or replacement are important factors. We used the notion of “storage age” or number of object overwrites as way of normalizing wall clock time. Storage age allows our results or similar such results to be applied across a number of read:write ratios and object replacement rates.

Details

Publication type

TechReport

Number

MSR-TR-2006-45

Pages

10

Institution

Microsoft Research

 

Không có nhận xét nào:

Đăng nhận xét

(Chơi cho vui) AIRDROP CHAINGE FINANCE - dự án xây dựng ứng dụng ngân hàng số cho mọi người

 Không hiểu lắm về cái này, tuy nhiên thấy quảng cáo khá nhiều, lại chỉ cung cấp vài thông tin cá nhân (mà mấy ông lớn như facebook với goog...