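Cloud computing: architecture and key technologies
LUO Jun-zhou, JIN Jia-hui, SONG Ai-bo, DONG Fang
School of Computer Science and Engineering, Southeast University, Nanjing, Jiangsu 211189, China
Abstract: This paper systematically surveys the state of cloud computing research and divides the cloud computing architecture into three layers: core services, service management, and user access interfaces. With the research goals of low cost, high reliability, high availability, and scalability in view, it gives a comprehensive account of the key technologies of cloud computing and the latest research progress. For cloud infrastructure, it covers data center design and management and resource virtualization. For large-scale data processing, it analyzes massive data processing platforms and their resource management and scheduling techniques. For service assurance, it discusses quality-of-service guarantees and security and privacy protection. In light of emerging cloud applications and the current limitations of cloud computing, it then discusses and outlines future research directions. Finally, it introduces the Southeast University cloud computing platform and related results in cloud computing research and application.
Key words: cloud computing; virtualization; data center; massive data processing; quality of service; security and privacy
CLC number: TP393
Document code: A
Article ID: 1000-436X(2011)07-0003-19
Received: 2011-05-20; Revised: 2011-06-30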
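Foundation Items: The National Natural Science Foundation of China (61070161, 61070158, 61003257, 60773103, 90912002); The National Basic Research Program of China (973 Program) (2010CB328104); The National Key Technology R&D Program of China (2010BAI88B03); The Doctoral Fund of the Ministry of Education of China (200802860031); The Natural Science Foundation of Jiangsu Province (BK2008030); The National Science and Technology Major Project of China (2009ZX03004-004-04); The Jiangsu Provincial Key Laboratory of Network and Information Security (BM2003201); The Key Laboratory of Computer Network and Information Integration, Ministry of Education (93K-9)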
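LUO Jun-zhou (1960-), male, born in Ningbo, Zhejiang, Ph.D., is a professor and doctoral supervisor at Southeast University. His research interests include grid and cloud computing, next-generation network architecture, protocol engineering, network security and management, and service computing.
JIN Jia-hui (1986-), male, born in Wenzhou, Zhejiang, is a Ph.D. candidate at Southeast University. His research interests include cloud computing and massive data processing.
SONG Ai-bo (1970-), male, born in Yantai, Shandong, Ph.D., is an associate professor at Southeast University. His research interests include grid and cloud computing, massive data processing, and Petri net theory and applications.
DONG Fang (1982-), male, born in Nanjing, Jiangsu, Ph.D., is a lecturer at Southeast University. His research interests include cloud computing, grid computing, and massive data processing.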