BENCHMARKING RELATIONAL DATABASE SYSTEMS - OVERVIEW

 Jia-Lang Seng   
 
Department of Management Information Systems, National Cheng Chi University 
Wen Shan, Taipei, 116, Taiwan ROC 

------------------------------------------------------------------------------

Abstract

 
Benchmarks are vital tools in the performance evaluation
and measurement of relational database management systems
(RDBMS). Standard benchmarks such as the Wisconsin, TP1,
TPC-A, TPC-B, TPC-C, and AS3AP benchmarks have been used to
assess the performance of RDBMS software. A wide variety of
users depend on these benchmarks to select systems, to
identify bottlenecks, and to verify technology
improvements. However, in our country, database benchmarking
n.  We conclude  the paper  with  a discussion  of the  open 
research issues in benchmarking database systems. 
 
Key words: 
        Database Benchmark, Database Management System, 
        Performance Measurement Evaluation and Tuning, 
        Workload Characterization and Modeling

 
 
1.  Introduction 
 
Benchmarks are crucial tools in database system performance
measurement and evaluation. After more than a decade of
evolution, database benchmarks have become a mature and
important paradigm for analyzing and comparing database
software. Standard benchmarks, such as the Wisconsin, AS3AP,
TP1, TPC-A, TPC-B, TPC-C, OO1, and OO7 benchmarks, have been
widely used to develop performance models, to compare
alternative designs, to pinpoint system bottlenecks, to
select among software products, and to predict system
behavior [Gray 1993]. They serve as indispensable tools that
help academics, practitioners, programmers, benchmarkers,
and even managers to validate database research results, to
verify software prototype improvements, and to approve the
systems selection and procurement process. It has hence
become important to understand the relevance, concepts, and
complexity involved in benchmark modeling, design, and
implementation in order to follow the improvement and
advancement of database technology [Sawyer 1993]. In this
paper, we present a comparative overview of the current
standard benchmarks, describe their design rationales, and
contrast their differences. Basic concepts of database
benchmarking and workload modeling are introduced as
background. In addition to the standard benchmarks, we
describe a new benchmark method called the
Requirements-Driven Database Benchmark (R Benchmark), which
is developed to address the issues of generality and
accuracy. We portray its architecture and main components.
We then compare and contrast these benchmark methods to
offer a useful reference for research and field studies.
 
2.  Database Benchmarking  
 
A benchmark is a standard by which something can be measured
or judged. A database benchmark is defined as a standard set
of executable instructions used to measure and compare the
relative and quantitative performance of two or more
database systems through the execution of controlled
experiments [Highleyman 1989] and [Jain 1991]. Benchmarking
is therefore a process of evaluating different database
software systems on the same or different hardware
platforms. Each experiment is made up of two kinds of
variables. One is the set of independent variables, which
affect the performance of database systems and are called
the experimental factors. Examples include the database
size, query complexity, and system configuration. The other
is the set of dependent variables, which represent the
quantitative measurements collected from the benchmarking
process. They are called the performance metrics. Common
performance metrics include the throughput metric, which is
the amount of work completed in a certain time period, and
the response time metric, which is the ratio of time spent
over a certain work volume. A benchmark is also used to
answer the question of which database system to purchase,
and the answer is always the system that does the job
correctly and most economically. There is therefore a price
metric, which considers the five-year cost of ownership,
including the hardware and software purchase and maintenance
costs over a period of five years, and which underlies the
price/performance metric.
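
As a rough illustration of the metrics above, the following
Python sketch computes throughput, mean response time, and
price/performance for one benchmark run. The class name, the
field names, and the sample numbers are assumptions made for
illustration only; they are not taken from any benchmark
specification.

```python
from dataclasses import dataclass


@dataclass
class RunResult:
    """Measurements from one benchmark run (all fields are illustrative)."""
    completed: int                 # transactions completed during the run
    elapsed_seconds: float         # length of the measurement interval
    total_response_seconds: float  # sum of the individual response times
    five_year_cost: float          # purchase plus maintenance over five years


def throughput(r: RunResult) -> float:
    """Work completed per unit of time (transactions per second)."""
    return r.completed / r.elapsed_seconds


def mean_response_time(r: RunResult) -> float:
    """Time spent per unit of work (seconds per transaction)."""
    return r.total_response_seconds / r.completed


def price_performance(r: RunResult) -> float:
    """Five-year cost of ownership divided by throughput."""
    return r.five_year_cost / throughput(r)


# Made-up sample run: 6,000 transactions in 10 minutes.
run = RunResult(completed=6000, elapsed_seconds=600.0,
                total_response_seconds=9000.0, five_year_cost=250000.0)
print(throughput(run), mean_response_time(run), price_performance(run))
```

A lower price/performance value means that less money is
spent per unit of throughput.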
 
 
3.  Workload Modeling 
 
Benchmark results depend on the workload, specific
application requirements, and system design and
implementation. A workload is the amount of work assigned to
or performed by a worker or unit of workers in a given time
period. The workload of a database benchmark is the amount
of work assigned to or performed by a database system in a
given period of time. The scope and scale of a benchmark
hence rely on the workload defined [Gray 1993]. In theory,
if W is a workload for a benchmark, then the set of
workloads is Wi = {W1, W2, ..., Wm}, where m is the number
of workloads to be used; if S represents a system under
test, then the set of systems is Sj = {S1, S2, ..., Sn},
where n is the number of systems to be tested; and if P is a
performance result gathered in a benchmark, then the set of
performance results is Pij = {P11, P12, ..., Pmn}, where P11
is the result of the first workload on the first system, P12
is the result of the first workload on the second system,
and so on [... 1983].
Therefore, with a different set of workloads, we obtain a
distinct set of performance readings. The importance of the
workload is that it determines the nature and extent of the
benchmarking.
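
The relation between workloads, systems, and results can be
sketched in a few lines of Python. This is a minimal sketch
under our own assumptions: the function name run_matrix and
the lookup table of timings are hypothetical stand-ins for a
real measurement harness.

```python
from typing import Callable, Dict, List, Tuple


def run_matrix(workloads: List[str], systems: List[str],
               measure: Callable[[str, str], float]
               ) -> Dict[Tuple[str, str], float]:
    """Return P[(workload, system)] for every workload/system pair."""
    return {(w, s): measure(w, s) for w in workloads for s in systems}


# Hypothetical stand-in for real measurements: a fixed lookup table.
fake_timings = {("W1", "S1"): 12.0, ("W1", "S2"): 9.5,
                ("W2", "S1"): 30.0, ("W2", "S2"): 41.0}

P = run_matrix(["W1", "W2"], ["S1", "S2"],
               lambda w, s: fake_timings[(w, s)])

# With m = 2 workloads and n = 2 systems there are m * n results.
print(len(P), P[("W1", "S2")])
```

Changing the list of workloads changes every entry of P,
which is why the choice of workload fixes the scope of the
whole benchmark.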
 
Workloads are best described by the amount of work, the rate
at which the work is created, and the characteristics,
distribution, and content of the work. A workload consists
of a test database and test operations. Depending upon their
characteristics, workloads can be categorized into
empirical, synthetic, and mixed workloads. Empirical
workloads are the actual data and real applications. They
are the ideal tests of performance because they represent
the realistic environment. However, empirical workloads are
usually difficult to obtain and sometimes very expensive to
install. In addition, they are hard to scale or change to
suit the specific purposes set out for a given benchmark.
Synthetic workloads are more often used because they are
easy to obtain, scale, and change, and they readily
accommodate the various test designs of a benchmark. Mixed
workloads combine real data with artificial test queries.
They can be a good candidate for pro-realistic
benchmarking.
 
Workload modeling is critical to the success of database
benchmarking. Modeling characterizes the database workload
and identifies the factors which affect system performance.
It involves workload formulation and construction, in
detail: domain analysis, modeling level determination,
component identification, characteristic parameterization,
data collection, parameter value assignment, and
representativeness validation [Ferrari, Serazzi, and Zeigner
1983].
 
In database benchmarking, workloads are usually modelled at
the functional level, where a workload is characterized by
the applications it consists of, as well as at the virtual
level, where the logical resources of the system to be
consumed are considered. In validating the model, there are
four levels of representativeness: (1) W is representative
of the real workload if it demands the same physical
resources in the same proportions as the real workload, (2)
W is representative of the real workload if it demands the
same physical resources at the same rate as the real
workload, (3) W is representative of the real workload if it
performs the same functions in the same proportions as the
real workload, and (4) W is representative of the real
workload if it produces the same values of the performance
metrics, P, as the real workload when running on the same
systems, S. The fourth level of validation is usually very
difficult to achieve. To elaborate, workload modeling
comprises data analysis and transaction analysis. Data
analysis characterizes data in terms of the size of the
database, the number of records, the length of records, the
types of fields, the value distributions and correlations,
the keys and indexing, the hit ratios, the selectivity
factors, and the joining fields and tables. Transaction
analysis characterizes the complexity of transactions, the
correlation of transactions, the data input into the
transactions, the fields and tables used by the
transactions, the result size, and the output modes. Both
are further analyzed with the control requirements of length
of test, number of users, order of transactions, frequency
and distribution of transactions, and performance metrics
used. Each parameter can have the dimension of the level of
complication involved. Therefore, a desirable and successful
benchmark depends on the workload being relevant, portable,
scalable, and credible.
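
The third level of representativeness (the same functions in
the same proportions) lends itself to a simple check. The
Python sketch below compares the operation mix of a model
workload with that of a real workload. The function names,
the 5% tolerance, and the sample operation lists are
assumptions chosen for illustration.

```python
from collections import Counter
from typing import Dict, Iterable


def mix(operations: Iterable[str]) -> Dict[str, float]:
    """Fraction of the workload contributed by each function."""
    counts = Counter(operations)
    total = sum(counts.values())
    return {op: n / total for op, n in counts.items()}


def same_proportions(model: Iterable[str], real: Iterable[str],
                     tolerance: float = 0.05) -> bool:
    """True if every function's share differs by at most `tolerance`."""
    m, r = mix(model), mix(real)
    return all(abs(m.get(op, 0.0) - r.get(op, 0.0)) <= tolerance
               for op in set(m) | set(r))


# Made-up traces: the real workload is 70% selects and 30% updates.
real_trace = ["select"] * 70 + ["update"] * 30
good_model = ["select"] * 7 + ["update"] * 3
poor_model = ["select"] * 5 + ["update"] * 5

print(same_proportions(good_model, real_trace))  # True
print(same_proportions(poor_model, real_trace))  # False
```

The first two levels would need resource measurements rather
than operation counts, and the fourth would need a full run
of both workloads on the same systems.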
 
4.  Standard Benchmarks 
 
Standard database benchmarks consist of: 
 
.       The Wisconsin Benchmark is a relational query benchmark;
.       The AS3AP Benchmark is a complex mixed workload benchmark;
.       The TP1 Benchmark, the TPC-A and TPC-B Benchmarks are on-line
        transaction processing (OLTP) benchmarks simulating one bank
        transaction type;
.       The TPC-C Benchmark is a complex OLTP benchmark simulating
        order entry and inventory control transactions in a
        production environment;
.       The TPC-D Benchmark is a Decision Support System (DSS)
        benchmark simulating the complex read and reporting
        workloads;
.       The TPC-E Benchmark will be a large Enterprise benchmark
        which includes large and complicated query sets for an
        entire enterprise;
.       The Client-Server Benchmark will be a specialized and
        complicated benchmark designed specifically to test the
        client-server database software.
 
We  categorize  these  benchmarks   into  relational   query 
standard  benchmarks  and  on-line  transaction   processing 
standard benchmarks, and introduce a new requirements-driven 
benchmark in the following sections. 
 
 
4.1 Relational Query Benchmarks 
 
The Wisconsin Benchmark 
 
The Wisconsin Benchmark, described in [Bitton, DeWitt, and
Turbyfill 1983] [Boral and DeWitt 1984] [Bitton and
Turbyfill 1985] [Bitton and Turbyfill 1988], and [DeWitt
1993], is the first effort to systematically measure and
compare the performance of relational database systems and
database machines. The benchmark is a single-user,
single-factor experiment using a synthetic database and a
controlled workload. It measures the query optimization
performance of database systems with 32 query types that
exercise the components of the proposed systems. The query
suites include selection, join, projection, aggregate, and
simple update queries.
 
The test database consists of four generic relations. The
tenk relation is the key table and the most used. Two data
types, small integer number and character string, are
utilized. Data values are uniformly distributed. The primary
metric is the query elapsed time. The main criticisms of the
benchmark include the single-user nature of the workload,
the simplistic database structure, and the unrealistic query
tests. A number of efforts have been made to extend the
benchmark to incorporate multi-user tests. However, they
have not received the same acceptance as the original
Wisconsin benchmark, except for an extension called the
AS3AP benchmark.
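
To make the single-user, elapsed-time style of measurement
concrete, the following Python sketch builds one synthetic
relation with uniformly distributed integer values and times
a selection query against it. It uses SQLite from the Python
standard library purely for convenience. The column names,
the row count, and the 1% selection are our own assumptions
and do not reproduce the actual Wisconsin schema or query
set.

```python
import random
import sqlite3
import time


def build_relation(conn: sqlite3.Connection, rows: int, seed: int = 0) -> None:
    """Create a synthetic relation with uniform integer values."""
    rng = random.Random(seed)
    # Column names are hypothetical, not the benchmark's own attributes.
    conn.execute(
        "CREATE TABLE tenk (unique_id INTEGER, rand_int INTEGER, label TEXT)")
    data = [(i, rng.randrange(rows), f"row{i:08d}") for i in range(rows)]
    conn.executemany("INSERT INTO tenk VALUES (?, ?, ?)", data)
    conn.commit()


def timed_query(conn: sqlite3.Connection, sql: str, params=()):
    """Run one query and return (rows, elapsed seconds)."""
    start = time.perf_counter()
    result = conn.execute(sql, params).fetchall()
    return result, time.perf_counter() - start


conn = sqlite3.connect(":memory:")
build_relation(conn, 10_000)

# Selection of 100 out of 10,000 rows, i.e. a 1% selectivity factor.
rows, elapsed = timed_query(
    conn, "SELECT * FROM tenk WHERE unique_id BETWEEN ? AND ?", (0, 99))
print(len(rows), f"{elapsed:.6f} s")
```

Because only one query runs at a time, the elapsed time
reflects query processing alone and says nothing about
behavior under concurrent users, which is the main criticism
noted above.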
 
The AS3AP Benchmark 
 
The  AS3AP  Benchmark  stands  for  the  ANSI  SQL  Standard 
Scaleable  and Portable  Benchmark  described  in [Turbyfill 
1987]  [Turbyfill, Orji, and  Bitton  1989], and [Turbyfill, 
Orji, and Bitton  1993].  The benchmark  models complex  and 
mixed workloads, including single-user and multi-user tests, 
and  operational   and  functional   tests.   There  are  39 
single-user  queries  consisting  of  utilities,  selection, 
join, projection, aggregate, integrity, and bulk updates. 
 
The four multi-user modules include a concurrent random read
test, or pure information retrieval (IR) test, which
executes a one-row selection on the same relation; a
concurrent random write test, or pure on-line transaction
processing (OLTP) test, which executes a one-row update on
the same relation; a mixed IR test; and a mixed OLTP test.
The concurrent random read test examines the maximum number
of concurrent users the system can handle retrieving from
the same relation. The concurrent random write test inspects
the maximum number of concurrent users the system can handle
updating the same relation. The mixed IR test and the mixed
OLTP test measure the effects of the cross-section queries
on the system with concurrent random reads or concurrent
random writes.
 
The test database  consists of four generic relations.  Each 
has  the same  number  of fields  and  the  same  number  of 
records.  The database scales up by increasing the number of 
records  in each  table.  A number  of data types  are used, 
including  long integer  number, double  precision  floating 
point number, decimal  number, money, datetime, fixed-length 
and  variable-length  character  strings.  Data  values  are 
created with uniform and non-uniform  data distributions.  A 
new  performance  metric, the equivalent  database  size, is 
defined  to measure the largest  database  size the proposed 
system  can  process  within  a  12-hour  time  limit.   The 
benchmark  tries to provide a balanced workload  to test the 
system  performance  on  utilities,  access  methods,  query 
optimization, and concurrency control. 
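
The equivalent database size metric can be read as a search
for the largest database that still finishes within the time
limit. The sketch below shows that idea only; the function
name, the candidate sizes, and the linear cost model are
hypothetical, and a real run would execute the benchmark
rather than evaluate a formula.

```python
from typing import Callable, Iterable, Optional

TIME_LIMIT_HOURS = 12.0  # the 12-hour limit described in the text


def equivalent_database_size(sizes: Iterable[int],
                             run_hours: Callable[[int], float]
                             ) -> Optional[int]:
    """Largest candidate size whose run completes within the limit."""
    best = None
    for size in sorted(sizes):
        if run_hours(size) <= TIME_LIMIT_HOURS:
            best = size
        else:
            break  # assumes run time never decreases as the database grows
    return best


# Hypothetical cost model: three hours per unit of database size.
result = equivalent_database_size([1, 2, 4, 8], lambda size: 3.0 * size)
print(result)  # 4
```

If no candidate size meets the limit, the function returns
None.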
 
 
 
 
 
 
4.2 On-Line Transaction Processing Benchmarks 
 
The TPC-A and the TPC-B Benchmarks 
 
The Transaction Processing Performance Council (TPC) is a
standards organization established in 1988 by a consortium
of hardware and software systems vendors. The objective of
the organization is to define industry standard benchmarks
for database and transaction processing systems. TPC has
announced three standard OLTP benchmarks: the TPC-A and
TPC-B benchmarks in 1989 and the TPC-C benchmark in 1992.
The Council is currently developing the TPC-D benchmark, a
complex DSS benchmark; the TPC-E benchmark, an
enterprise-wide benchmark; and a client-server (C-S)
benchmark. TPC-D and TPC-E are expected to be approved in
1995, and the C-S benchmark is scheduled to be announced in
1997.
 
The TPC-A Benchmark, announced in 1989 and revised in 1992
[TPC-A 1989 and 1992], is a standardization of the earlier
ET1/TP1 Datamation Benchmark described in [Anon et al.
1985]. The benchmark is an update-intensive on-line
transaction processing (OLTP) benchmark which simulates one
hypothetical bank transaction type in a networking
environment. It schedules the transactions continuously with
an exponential think time of 10 seconds. The test database
consists of four relations. The sizes of the relations are
kept in a fixed scaling ratio to one another, based on an
estimate of the throughput metric.
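
The exponential think time can be simulated directly. The
sketch below draws think times from an exponential
distribution, assuming that the 10 seconds quoted above is
the mean of the distribution; the function name and the seed
are our own choices.

```python
import random
from typing import List

MEAN_THINK_SECONDS = 10.0  # assumed to be the mean of the distribution


def think_times(n: int, seed: int = 42) -> List[float]:
    """Draw n think times (in seconds) from an exponential distribution."""
    rng = random.Random(seed)
    # expovariate takes the rate, which is 1 / mean.
    return [rng.expovariate(1.0 / MEAN_THINK_SECONDS) for _ in range(n)]


samples = think_times(100_000)
average = sum(samples) / len(samples)
print(f"average think time: {average:.2f} s")
```

A terminal emulator would wait for one such interval before
submitting each new transaction, so that arrivals are spread
irregularly over time rather than issued back to back.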
 
In the standard, TPC formally defines three performance
metrics: the transactions per second (tps), the response
time (rt), and the five-year cost of ownership metrics. The
tpsA-Local metric is used in a local area network (LAN)
environment and the tpsA-Wide metric is used in a wide area
network (WAN) environment. The database and workload scaling
is based on the estimated tps number. Furthermore, TPC
requires the atomicity, consistency, isolation, and
durability (ACID) properties of transactions to be tested
and supported. The full-disclosure report and independent
audit requirements are rigorously defined and recommended.
The main criticism of the benchmark is the over-simplified
standard workload using one transaction type, which tests
limited aspects of system performance and represents a
narrow spectrum of OLTP workloads.

The TPC-B Benchmark, approved in 1989 and revised in 1992
[TPC-B 1989 and 1992], is also an OLTP benchmark. It uses
the same transaction profile as the TPC-A benchmark and is
considered a database stress test in a non-networking
environment, which distinguishes it from TPC-A. There is no
think time involved in the transaction generation. The
response time metric is replaced by the CPU residence time
metric. The tpsB metric is defined to substitute for the
tpsA metrics. The TPC-A and TPC-B benchmarks will cease to
be used in the summer of 1995 and will be replaced by the
TPC-C Benchmark.
 
The TPC-C Benchmark 
 
The TPC-C Benchmark, announced in 1992 and revised in 1993
[TPC-C 1992 and 1993], is a complex OLTP benchmark. The
benchmark emulates hypothetical order entry and inventory
control transactions in a production environment. It
imitates a wholesale supplier which has one or more
warehouses supplying items to sales districts. Each
warehouse is responsible for 10 districts and each district
serves 3,000 customers. Customers can place orders by phone
or in person. Each customer order contains 10 items on
average. Each regional warehouse maintains 100,000 items in
stock. The benchmark is referred to as the order entry
benchmark.
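
The figures above imply how the database grows with the
number of warehouses. The following sketch works through
that arithmetic; the function name and the dictionary keys
are our own, and only the three constants come from the
description above.

```python
from typing import Dict

DISTRICTS_PER_WAREHOUSE = 10
CUSTOMERS_PER_DISTRICT = 3_000
STOCK_ITEMS_PER_WAREHOUSE = 100_000


def cardinalities(warehouses: int) -> Dict[str, int]:
    """Row counts implied by the per-warehouse scaling figures."""
    districts = warehouses * DISTRICTS_PER_WAREHOUSE
    return {
        "warehouses": warehouses,
        "districts": districts,
        "customers": districts * CUSTOMERS_PER_DISTRICT,
        "stock_rows": warehouses * STOCK_ITEMS_PER_WAREHOUSE,
    }


# Five warehouses: 50 districts, 150,000 customers, 500,000 stock rows.
print(cardinalities(5))
```

Each additional warehouse therefore adds 30,000 customers
and 100,000 stock rows.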
 
There  are five transaction  types defined  in the workload. 
The  first  transaction  is a new  order  transaction  which 
portrays   a  mid-weight,  read-write,  and   high-frequency 
transaction. The second transaction is a payment transaction 
which    portrays    a    light-weight,   read-write,    and 
high-frequency  transaction.  The  third  transaction  is an 
order  status  transaction   which  portrays  a  mid-weight, 
read-only,  and  low-frequency   transaction.   The   fourth 
transaction  is  a delivery  transaction  which  portrays  a 
mid-weight,  read-write, and  background  transaction.  The
fifth  transaction   is  a  stock  level  transaction  which 
portrays   a  heavy-weight,  read-only,  and   low-frequency 
transaction. 
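
A workload driver has to choose among the five transaction
types according to their relative frequencies. The sketch
below does this with a weighted random choice. The weights
are illustrative assumptions that merely reflect the
high-frequency and low-frequency labels above; they should
not be read as the mix required by the specification.

```python
import random
from collections import Counter
from typing import List

# Assumed relative weights: two frequent types, three infrequent ones.
MIX = {
    "new_order": 45,
    "payment": 43,
    "order_status": 4,
    "delivery": 4,
    "stock_level": 4,
}


def draw_transactions(n: int, seed: int = 1) -> List[str]:
    """Draw n transaction types according to the weights in MIX."""
    rng = random.Random(seed)
    names = list(MIX)
    weights = [MIX[name] for name in names]
    return rng.choices(names, weights=weights, k=n)


counts = Counter(draw_transactions(10_000))
for name in MIX:
    print(f"{name:13s} {counts[name]}")
```

With these weights the new order and payment transactions
dominate the stream, while the other three types appear only
occasionally.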
 
The TPC-C database consists of nine relations. Each relation 
corresponds  to an entity  in the  hypothetical  wholesaler. 
Four  data  types, including  long  integer  number,  double 
precision  floating point number, datetime, and fixed-length 
character  string, are used.  Field values are created  with 
uniform  and non-uniform  data  distributions.  The database 
scales  up with  the  number  of warehouses.  A new  metric, 
transaction  per minute (tpmC), is defined to represent  the 
business throughput of the more complex OLTP workload. 
 
[Gray  1993]  [Leutenegger  and Dias 1993], and [Raab  1993] 
consider TPC-C 10 times heavier than the TPC-A and the TPC-B 
benchmarks.  It is felt  that  the TPC-C  benchmark  is more 
representative of the computer use in the 1990's. 
 
The TPC-D Benchmark 
 
The TPC-D Benchmark is a Decision Support System (DSS)
benchmark currently under development by the TPC committee
and is expected to be announced in mid-1995. TPC-D addresses
the complex queries of a DSS workload, which analyzes the
results of OLTP transactions. It models a global enterprise
business environment in which decision support queries are
run against the output of a non-stop OLTP business
environment [TPC-D 1995]. Three components of TPC-D
differentiate it from the previous TPC benchmarks. The first
is the manipulation of large data sets: TPC-D relies on
aggregation and depends heavily upon ORDER BY, GROUP BY, and
JOIN queries. The second is data correlation, in which the
interrelation of data is the core of the TPC-D query set.
The third is data derivation: TPC-D projects new views of
the data and operates against them. The base size of the
test database is six hundred megabytes. TPC-D mainly
measures the execution time of a complete query set.
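
The kind of query that such a workload stresses can be shown
on a toy scale. The Python sketch below runs a single query
that combines JOIN, GROUP BY, and ORDER BY using SQLite. The
two tables, their columns, and the data are invented for
illustration and are unrelated to the actual TPC-D schema or
query set.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical two-table schema, invented for this example.
conn.execute("CREATE TABLE customers (cust_id INTEGER, region TEXT)")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER, cust_id INTEGER, amount REAL)")
conn.executemany("INSERT INTO customers VALUES (?, ?)",
                 [(1, "ASIA"), (2, "EUROPE"), (3, "ASIA")])
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                 [(1, 1, 100.0), (2, 1, 50.0), (3, 2, 70.0), (4, 3, 30.0)])

# Total order amount per region, largest first.
report = conn.execute(
    """
    SELECT c.region, SUM(o.amount) AS total
    FROM orders AS o
    JOIN customers AS c ON o.cust_id = c.cust_id
    GROUP BY c.region
    ORDER BY total DESC
    """
).fetchall()

print(report)  # [('ASIA', 180.0), ('EUROPE', 70.0)]
```

On a database of several hundred megabytes, queries of this
shape scan and aggregate large parts of the data, which is
why execution time of the whole query set is the natural
measurement.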
 
The TPC-E Benchmark 
 
The TPC-E Benchmark is an enterprise-wide benchmark
currently under development by the TPC committee and is
expected to be announced at the end of 1995. TPC-E addresses
both complex queries in OLTP systems and DSS queries in
non-OLTP environments. The Enterprise benchmark uses TPC-C
as its foundation and evolves toward the complex end of the
transaction spectrum. First, the TPC-E workload will require
a more aggressive response time than TPC-C. Second, an
on-line decision support query and concurrent batch activity
will be added. Third, system-level recovery metrics are
required in a seven-days-a-week, twenty-four-hour
operational environment [TPC-E 1995].
 
4.3 Requirements-Driven Database Benchmark
 
The above standard benchmarks are synthetic and
domain-specific in that they model certain typical
applications in a problem domain and create synthetic
workloads. Test results from these benchmarks are estimates
of possible system performance for certain pre-determined
application types. When the user domain differs from the
standard problem domains, or when the application workload
diverges from the synthetic workloads, they do not provide
an effective means to measure the system's performance on
the user's problem domain. Database system performance on
actual domain data and transactions may vary significantly
from the benchmark.
 
To   address    the    important    issues    of   accuracy, 
representativeness,  and   generality,  we  propose   a  new 
benchmark  approach  which is called the Requirements-Driven 
Database Benchmark  (R Benchmark)  described  in [Seng, Yao, 
and Hevner  1995].  The R Benchmark  is a domain-independent 
and workload-independent  benchmark that models the workload 
development   in  a  process  of  workload   representation, 
transformation, and generation.  It is an application-driven 
and a more general  method  which derives  benchmark  suites 
from the user domain  and produces  test workloads  from  the
transaction specifications. 
 
The R Benchmark consists of three main components: a
high-level application specification language, a translator
of the language, and a set of generators for the database
and the system transactions. We use the specification
language to model and formalize workload requirements. We
use the translator to translate and transform the
specifications for code generation. We apply the generators
to create the test databases and test workloads.
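
The flow from specification to translator to generator can
be pictured with a toy example. The sketch below is not the
R Benchmark specification language or its tools, which are
not reproduced in this overview; the dictionary format, the
function names, and the value ranges are all hypothetical
and serve only to show the three stages in sequence.

```python
import random
from typing import Dict, List, Tuple

# Stage 1 (hypothetical specification): relation -> rows and column kinds.
SPEC = {
    "account": {
        "rows": 5,
        "columns": {"id": "sequence", "balance": "uniform_int"},
    }
}


def translate(spec: dict) -> List[str]:
    """Stage 2: turn the specification into CREATE TABLE statements."""
    statements = []
    for table, body in spec.items():
        cols = ", ".join(f"{name} INTEGER" for name in body["columns"])
        statements.append(f"CREATE TABLE {table} ({cols})")
    return statements


def generate(spec: dict, seed: int = 0) -> Dict[str, List[Tuple[int, ...]]]:
    """Stage 3: produce synthetic rows for every specified relation."""
    rng = random.Random(seed)
    data = {}
    for table, body in spec.items():
        rows = []
        for i in range(body["rows"]):
            # "sequence" columns count upward; others are uniform integers.
            row = tuple(i if kind == "sequence" else rng.randint(0, 1000)
                        for kind in body["columns"].values())
            rows.append(row)
        data[table] = rows
    return data


print(translate(SPEC)[0])
print(len(generate(SPEC)["account"]))
```

The point of the separation is that a different user domain
changes only the specification, while the translator and the
generators stay the same.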
 
So far, we have used several standard  benchmarks  including 
the TPC-A, Wisconsin, and AS3AP benchmarks  to validate  the 
method.  Test results  have shown that our method  can model 
these benchmarks  and these standard benchmarks are a subset 
of  our  method  [Seng,  Yao,  and  Hevner  1995].   We  are 
continuing   the  experiments   with  the  TPC-C  and  TPC-D 
benchmarks to further illustrate the validity and generality 
of this new approach.

4.4 Comparative Illustration
 
We compare these benchmarks using an illustration in Table 1
with seven criteria. These criteria include (1) the
creator(s) and creation time, (2) the nature of the
benchmark, i.e., whether it is a relational query, an OLTP,
or a DSS benchmark, (3) the networking consideration, (4)
the main metrics adopted or created, (5) the test workloads,
i.e., the query types or transaction types, (6) the data
types, and (7) the data distribution used to generate the
synthetic database values.
 
 
5. Conclusions 
 
We have  described  the major  standard  benchmarks  in this 
paper.  We  have  discussed  their  designs, components, and 
comparisons.  We have portrayed  a new approach  to database 
benchmarking in addressing the critical issues of generality 
and representativeness. An open benchmark development method 
which is based on the user's requirements is a promising way 
to  manage  the  increasingly   complicated  and  voluminous 
database performance evaluation. We continue to witness
active development in the benchmarking area. Important
research   issues   include   modeling   large  and  complex 
workloads,  formulating   fair   and  accurate   performance 
metrics,  developing   tuning   strategies,  and   exploring 
benchmark generalization. 
 
 
References 
 
[Anon et al. 1985] Anon et al., "A Measure of Transaction
Processing Power," Datamation, 31(7): 112-118, April 1,
1985.
 
[Bitton,  DeWitt, and  Turbyfill  1983]  Bitton, D.,  D.  J. 
DeWitt, and C. Turbyfill, "Benchmarking Database Systems - A 
Systematic  Approach," Proceedings  of the 9th International 
Conference on Very Large Data Bases, August 1983, pp. 8-19. 
 
[Bitton  and Turbyfill  1985] Bitton, D.  and C.  Turbyfill, 
"Design and Analysis  of Multiuser  Benchmarks  for Database 
System," Proceedings of the HICSS-18 Conference, 1985. 
 
[Bitton and Turbyfill 1988] Bitton, D. and C. Turbyfill, "A
Retrospective on the Wisconsin Benchmark," appears in
Readings in Database Systems, Ed. by M. Stonebraker, Morgan
Kaufmann, Inc., 1988, pp. 280-299.
 
[Boral and DeWitt 1984] Boral, H. and D. J. DeWitt, "A
Methodology for Database System Performance Evaluation,"
Proceedings of the 1984 ACM SIGMOD International Conference
on Management of Data, May 1984, pp. 176-185.
 
[DeWitt 1993] DeWitt, D. J., "The Wisconsin Benchmark: Past, 
Present, and Future," appears in The Benchmark  Handbook for 
Database  and  Transaction  Processing  Systems, Ed.  by Jim 
Gray, Morgan Kaufmann, Inc., 1993, pp. 269-316. 
 
[Ferrari, Serazzi, and Zeigner 1983] Ferrari, D., G.
Serazzi, and A. Zeigner, Measurement and Tuning of Computer
Systems, Prentice Hall, Inc., 1983.
 
[Gray 1993] Gray, J. N., Ed., The Benchmark Handbook for
Database and Transaction Processing Systems, Morgan
Kaufmann, Inc., 1993.
 
[Highleyman 1989] Highleyman, W. H., Performance Analysis of 
Transaction Processing Systems, Prentice Hall, Inc., 1989. 
 
[Jain   1991]   Jain,  R.,  The  Art  of  Computer   Systems 
Performance  Analysis: Techniques  for Experimental  Design, 
Measurement, Simulation, and Modeling, John Wiley Co., 1991. 
 
[Leutenegger and Dias 1993] Leutenegger, S. T., and D. Dias, 
"A Modeling  Study of the TPC-C  Benchmark," Proceedings  of 
the 1993 ACM SIGMOD International  Conference  on Management 
of Data, May 1993, pp. 22-31. 
 
[Patterson and Hennessy 1990] Patterson, D. A. and Hennessy, 
J.  L.,  Computer  Architecture:  A  Quantitative  Approach, 
Morgan Kaufmann Co., 1990. 
 
[Raab 1993] Raab, F., "Overview of the TPC Benchmark C: A
Complex OLTP Benchmark," appears in The Benchmark Handbook
for Database and Transaction Processing Systems, Ed. by Jim
Gray, Morgan Kaufmann, Inc., 1993, pp. 131-144.
 
[Sawyer 1993] Sawyer, T., "Doing Your Own Benchmark,"
appears in The Benchmark Handbook for Database and
Transaction Processing Systems, Ed. by Jim Gray, Morgan
Kaufmann, Inc., 1993, pp. 543-562.
 
[Seng, Yao, and Hevner 1995] Seng, J.-L., S.  B. Yao, and A. 
R.  Hevner, "Requirements-Driven  Database Systems Benchmark 
Method," In Progress, 1995. 
 
[Serlin 1993] Serlin, O., "The History of DebitCredit and
the TPC," appears in The Benchmark Handbook for Database and
Transaction Processing Systems, Ed. by Jim Gray, Morgan
Kaufmann, Inc., 1993, pp. 21-40.
 
[TPC-A 1992] Transaction Processing Performance Council, TPC 
Benchmark A, Standard Specification, March 1992. 
 
[TPC-B 1992] Transaction Processing Performance Council, TPC 
Benchmark B, Standard Specification, March 1992. 
 
[TPC-C 1993] Transaction Processing Performance Council, TPC 
Benchmark C, Standard Specification, 1993. 
 
[TPC-D 1995] Transaction Processing Performance Council, TPC 
Benchmark D, Working Draft v. 9.0, January 1995. 
 
[TPC-E 1995] Transaction Processing Performance Council, TPC 
Benchmark E, Working Draft v. 7.3, February 1995. 
 
[Turbyfill 1987] Turbyfill, C., Comparative Benchmarking  of 
Relational    Database    Systems,   Unpublished    Doctoral 
Dissertation, TR87-871, Cornell University, 1987. 
 
[Turbyfill, Orji, and Bitton 1989] Turbyfill, C., C. Orji,
and D. Bitton, "AS3AP - A Comparative Relational Database
Benchmark," Proceedings of the IEEE COMPCON, 1989, pp.
560-564.
 
[Turbyfill, Orji, and Bitton  1993] Turbyfill, C., C.  Orji, 
and D.  Bitton, "AS3AP: An ANSI SQL Standard  Scaleable  and 
Portable Benchmark for Relational Database Systems," appears 
in The  Benchmark  Handbook  for  Database  and  Transaction 
Processing Systems, Ed.  by Jim Gray, Morgan Kaufmann, Inc., 
1993, pp. 317-358. 
 
 
BENCHMARK:    WISCONSIN
CREATOR:      DEWITT, BITTON, TURBYFILL 1983
NATURE:       SIMPLE QUERY SETS
NETWORKING:   SINGLE USER
MAIN METRICS: ELAPSED TIME
WORKLOAD:     SELECTION, JOIN, PROJECTION, SIMPLE UPDATE
DATA TYPE:    SMALL INTEGER, FIXED LENGTH STRING
DATA DIST.:   UNIFORM

BENCHMARK:    AS3AP
CREATOR:      TURBYFILL 1987
NATURE:       COMPLEX QUERY SETS
NETWORKING:   MULTIUSER
MAIN METRICS: ELAPSED TIME, LARGEST DATABASE SIZE
WORKLOAD:     LOAD, SELECTION, JOIN, PROJECTION, BULK UPDATE
DATA TYPE:    INTEGER, FLOATING POINT, DECIMAL, MONEY, DATETIME,
              FIXED AND VARIABLE LENGTH STRING
DATA DIST.:   UNIFORM AND NON-UNIFORM

BENCHMARK:    TPC-A
CREATOR:      TPC 1989
NATURE:       SIMPLE OLTP
NETWORKING:   NETWORKING
MAIN METRICS: TPS-A, RT, PRICE/PERFORMANCE
WORKLOAD:     BANK TXN
DATA TYPE:    INTEGER
DATA DIST.:   UNIFORM

BENCHMARK:    TPC-B
CREATOR:      TPC 1989
NATURE:       SIMPLE OLTP
NETWORKING:   NON-NETWORKING
MAIN METRICS: TPS-B, RT, PRICE/PERFORMANCE
WORKLOAD:     BANK TXN
DATA TYPE:    INTEGER
DATA DIST.:   UNIFORM

BENCHMARK:    TPC-C
CREATOR:      TPC 1992
NATURE:       COMPLEX OLTP
NETWORKING:   NETWORKING
MAIN METRICS: TPM-C, RT, PRICE/PERFORMANCE
WORKLOAD:     NEW-ORDER, PAYMENT, ORDER STATUS, DELIVERY,
              STOCK-LEVEL TXN
DATA TYPE:    INTEGER, FLOATING POINT, DECIMAL, STRING
DATA DIST.:   UNIFORM AND NON-UNIFORM

BENCHMARK:    TPC-D
CREATOR:      TPC 1995
NATURE:       COMPLEX DSS
NETWORKING:   NETWORKING
MAIN METRICS: TPC-D POWER, THROUGHPUT, PRICE/PERFORMANCE
WORKLOAD:     16 REPORT AND 2 UPDATE TXN
DATA TYPE:    INTEGER, FLOATING POINT, STRING
DATA DIST.:   UNIFORM AND NON-UNIFORM

BENCHMARK:    TPC-E
CREATOR:      TPC 1995
NATURE:       COMPLEX NON-OLTP
NETWORKING:   NETWORKING
MAIN METRICS: TPC-E THROUGHPUT, RT, UTILIZATION
WORKLOAD:     NEW-ORDER, PAYMENT, ORDER-STATUS, DELIVERY,
              STOCK-LEVEL, CUSTOMER-DEMOGRAPHICS, CUSTOMER INQUIRY,
              BATCH TXN
DATA TYPE:    INTEGER, FLOATING POINT, STRING
DATA DIST.:   UNIFORM AND NON-UNIFORM

BENCHMARK:    REQUIREMENTS-DRIVEN
CREATOR:      SENG, YAO, HEVNER 1994
NATURE:       OPEN
NETWORKING:   NETWORKING
MAIN METRICS: THROUGHPUT, RT, UTILIZATION
WORKLOAD:     USERS SPECIFY
DATA TYPE:    INTEGER, FLOATING POINT, DECIMAL, MONEY, DATETIME,
              STRING
DATA DIST.:   UNIFORM AND NON-UNIFORM
 
Table 1:  Comparison of Standard Benchmarks 
 
 
Author 
 
Jia-Lang Seng 
received  her Ph.D.  degree  in MIS from  the University  of 
Maryland at College Park, Maryland, USA. She is currently an 
Associate  Professor  of  MIS  at  the  National  Cheng  Chi 
University,  Taipei,  Taiwan  ROC.  Her  research  interests 
center around the benchmarking of database systems, software 
requirements analysis, and accounting information systems. 
