参考文献
|
-
[1] Jo, M., Maksymyuk, T., Strykhalyuk, B., & Cho, C. H. (2015). Device-to-device-based heterogeneous radio access network architecture for mobile cloud computing. IEEE Wireless Communications, 22(3), 50-58.
連結:
-
[2] Beloglazov, A., Buyya, R., Lee, Y. C., & Zomaya, A. (2011). A taxonomy and survey of energy-efficient data centers and cloud computing systems. Advances in computers, 82(2), 47-111.
連結:
-
[5] Vasile, M. A., Pop, F., Tutueanu, R. I., Cristea, V., & Kołodziej, J. (2015). Resource-aware hybrid scheduling algorithm in heterogeneous distributed computing. Future Generation Computer Systems, 51, 61-71.
連結:
-
[6] Di Nitto, E., Ghezzi, C., Metzger, A., Papazoglou, M., & Pohl, K. (2008). A journey to highly dynamic, self-adaptive service-based applications. Automated Software Engineering, 15(3), 313-341.
連結:
-
[7] Xu, X., Cao, L., & Wang, X. (2016). Adaptive task scheduling strategy based on dynamic workload adjustment for heterogeneous Hadoop clusters. IEEE Systems Journal, 10(2), 471-482.
連結:
-
[8] Nightingale, E. B., Chen, P. M., & Flinn, J. (2006). Speculative execution in a distributed file system. ACM Transactions on Computer Systems (TOCS), 24(4), 361-392.
連結:
-
[11] Wang, W., Zhao, W., Cai, C., Huang, J., Xu, X., & Li, L. (2015). An efficient image aesthetic analysis system using Hadoop.Signal Processing: Image Communication,39, 499-508.
連結:
-
[12] Ghazi, M. R., & Gangodkar, D. (2015). Hadoop, MapReduce and HDFS: A Developers Perspective.Procedia Computer Science,48, 45-50.
連結:
-
[19] Friedl, M. A., & Brodley, C. E. (1997). Decision tree classification of land cover from remotely sensed data. Remote sensing of environment, 61(3), 399-409.
連結:
-
[20] Chen, C. W., Luo, J., & Parker, K. J. (1998). Image segmentation via adaptive K-mean clustering and knowledge-based morphological operations with biomedical applications. Image Processing, IEEE Transactions on, 7(12), 1673-1683.
連結:
-
[31] Srikanth, B. V. S., & Reddy, V. K. (2016). Efficiency of stream processing engines for processing BIGDATA Streams. Indian Journal of Science and Technology, 9(14).
連結:
-
[34] Low, Y., Bickson, D., Gonzalez, J., Guestrin, C., Kyrola, A., & Hellerstein, J. M. (2012). Distributed GraphLab: a framework for machine learning and data mining in the cloud. Proceedings of the VLDB Endowment, 5(8), 716-727.
連結:
-
[36] Li, Y., Wang, S., & Ding, X. (2010, September). Person-independent head pose estimation based on random forest regression. In Image Processing (ICIP), 2010 17th IEEE International Conference on (pp. 1521-1524). IEEE.
連結:
-
[38] Linden, G., Smith, B., & York, J. (2003). Amazon. com recommendations: Item-to-item collaborative filtering. IEEE Internet computing, 7(1), 76-80.
連結:
-
[43] Lin, C. Y., Tsai, C. H., Lee, C. P., & Lin, C. J. (2014, October). Large-scale logistic regression and linear support vector machines using Spark. In Big Data (Big Data), 2014 IEEE International Conference on (pp. 519-528). IEEE.
連結:
-
[44] Watson, G. S. (1967). Linear least squares regression
連結:
-
[45] . The Annals of Mathematical Statistics, 1679-1699.
連結:
-
[46] Golub, G. (1965). Numerical methods for solving linear least squares problems. Numerische Mathematik, 7(3), 206-216.
連結:
-
[48] Ohsowski, B. M., Dunfield, K. E., Klironomos, J. N., & Hart, M. M. (2016). Improving Plant Biomass Estimation in the Field Using Partial Least Squares Regression and Ridge Regression. Botany, (ja).
連結:
-
[49] Ohsowski, B. M., Dunfield, K. E., Klironomos, J. N., & Hart, M. M. (2016). Improving Plant Biomass Estimation in the Field Using Partial Least Squares Regression and Ridge Regression. Botany, (ja).
連結:
-
[50] Li, Y., Wang, S., & Ding, X. (2010, September). Person-independent head pose estimation based on random forest regression. In Image Processing (ICIP), 2010 17th IEEE International Conference on (pp. 1521-1524). IEEE.
連結:
-
[52] Prasad, A. M., Iverson, L. R., & Liaw, A. (2006). Newer classification and regression tree techniques: bagging and random forests for ecological prediction. Ecosystems, 9(2), 181-199.
連結:
-
[53] Li, Y., Wang, S., & Ding, X. (2010, September). Person-independent head pose estimation based on random forest regression. In Image Processing (ICIP), 2010 17th IEEE International Conference on (pp. 1521-1524). IEEE.
連結:
-
[54] Alsheikh, M. A., Niyato, D., Lin, S., Tan, H. P., & Han, Z. (2016). Mobile Big Data Analytics Using Deep Learning and Apache Spark. arXiv preprint arXiv:1602.07031.
連結:
-
[56] Shi, W., Zhu, Y., Huang, T., Sheng, G., Lian, Y., Wang, G., & Chen, Y. (2016). An Integrated Data Preprocessing Framework Based on Apache Spark for Fault Diagnosis of Power Grid Equipment. Journal of Signal Processing Systems, 1-16.
連結:
-
[58] Ramirez-Gallego, S., Garcia, S., Mourino-Talin, H., & Martinez-Rego, D. (2015, August). Distributed Entropy Minimization Discretizer for Big Data Analysis under Apache Spark. In Trustcom/BigDataSE/ISPA, 2015 IEEE (Vol. 2, pp. 33-40). IEEE.
連結:
-
[60] Domoney, W. F., Ramli, N., Alarefi, S., & Walker, S. D. (2015, December). Smart city solutions to water management using self-powered, low-cost, water sensors and apache spark data aggregation. In 2015 3rd International Renewable and Sustainable Energy Conference (IRSEC) (pp. 1-4). IEEE.
連結:
-
[63] Vavilapalli, V. K., Murthy, A. C., Douglas, C., Agarwal, S., Konar, M., Evans, R., ... & Saha, B. (2013, October). Apache hadoop yarn: Yet another resource negotiator. In Proceedings of the 4th annual Symposium on Cloud Computing(p. 5). ACM.
連結:
-
[64] White, T. (2012). Hadoop: The definitive guide. " O'Reilly Media, Inc.".
連結:
-
[66] Shan, M. C., & Murphy, M. C. (1994). U.S. Patent No. 5,325,525. Washington, DC: U.S. Patent and Trademark Office.
連結:
-
[70] Kokilavani, T., & Amalarethinam, D. G. (2011). Load balanced min-min algorithm for static meta-task scheduling in grid computing. International Journal of Computer Applications, 20(2), 43-49.
連結:
-
[71] Lee, Y. H., Leu, S., & Chang, R. S. (2011). Improving job scheduling algorithms in a grid environment. Future generation computer systems, 27(8), 991-998.
連結:
-
[73] Yu Jun, Xiang Hai et al., Spark Core Technology and Advanced Applications, China Machine Press, pp. 238-253, 2015.
連結:
-
[74] Maniezzo, A. C. M. D. V. (1992). Distributed optimization by ant colonies. In Toward a practice of autonomous systems: proceedings of the First European Conference on Artificial Life (p. 134). Mit Press.
連結:
-
[78] Chaiken, R., Jenkins, B., Larson, P. Å., Ramsey, B., Shakib, D., Weaver, S., & Zhou, J. (2008). SCOPE: easy and efficient parallel processing of massive data sets. Proceedings of the VLDB Endowment, 1(2), 1265-1276.
連結:
-
[79] Lama, P., & Zhou, X. (2012, September). Aroma: Automated resource allocation and configuration of mapreduce environment in the cloud. In Proceedings of the 9th international conference on Autonomic computing (pp. 63-72). ACM.
連結:
-
[80] Gunn, S. R. (1998). Support vector machines for classification and regression. ISIS technical report, 14, 85-86.
連結:
-
[83] 徐佳俊, 刘功申, 苏波, & 孟魁. (2016). 基于 Spark 的异构叢集调度策略研究 Adaptive Scheduling Strategy for Heterogeneous Spark Cluster. Computer Science and Application, 6(11), 692.
連結:
-
[85] J. MacQueen, "Some Methods for Classification and Analysis of Multivariate Observations," in Proceedings of 5th Berkeley Symposium on Mathematical Statistics and Probability, 1967, pp. 281-297.
連結:
-
[90] Massie, M. L., Chun, B. N., & Culler, D. E. (2004). The ganglia distributed monitoring system: design, implementation, and experience. Parallel Computing, 30(7), 817-840.
連結:
-
[92] Taieb, S. B., & Hyndman, R. J. (2014). A gradient boosting approach to the Kaggle load forecasting competition. International journal of forecasting, 30(2), 382-394.
連結:
-
[3] Buyya, R., Abramson, D., & Giddy, J. (2000, May). Nimrod/G: An architecture for a resource management and scheduling system in a global computational grid. In High Performance Computing in the Asia-Pacific Region, 2000. Proceedings. The Fourth International Conference/Exhibition on (Vol. 1, pp. 283-289). IEEE.
-
[4] Qureshi, K. (2001). An empirical study of task scheduling strategies for image processing application on heterogeneous distributed computing system. Parallel and Distributed Computer Graphics, 103.
-
[9] Apache Spark, https://zh.wikipedia.org/wiki/Apache_Spark, June 2017 (last accessed:2017/06/11)
-
[10] Khare, R., Cutting, D., Sitaker, K., & Rifkin, A. (2004). Nutch: A flexible and scalable open-source web search engine. Oregon State University, 1, 32-32.
-
[13] Zaharia, M., Chowdhury, M., Franklin, M. J., Shenker, S., & Stoica, I. (2010). Spark: cluster computing with working sets. HotCloud, 10, 10-10.
-
[14] Qiu, H., Gu, R., Yuan, C., & Huang, Y. (2014, May). YAFIM: A Parallel Frequent Itemset Mining Algorithm with Spark. In Parallel & Distributed Processing Symposium Workshops (IPDPSW), 2014 IEEE International (pp. 1664-1671). IEEE.
-
[15] Gu, L., & Li, H. (2013, November). Memory or time: Performance evaluation for iterative operation on hadoop and spark. In High Performance Computing and Communications & 2013 IEEE International Conference on Embedded and Ubiquitous Computing (HPCC_EUC), 2013 IEEE 10th International Conference on (pp. 721-727). IEEE.
-
[16] Shvachko, K., Kuang, H., Radia, S., & Chansler, R. (2010, May). The hadoop distributed file system. In Mass storage systems and technologies (MSST), 2010 IEEE 26th symposium on (pp. 1-10). IEEE.
-
[17] Vora, M. N. (2011, December). Hadoop-HBase for large-scale data. In Computer science and network technology (ICCSNT), 2011 international conference on (Vol. 1, pp. 601-605). IEEE.
-
[18] Palankar, M. R., Iamnitchi, A., Ripeanu, M., & Garfinkel, S. (2008, June). Amazon S3 for science grids: a viable solution?. In Proceedings of the 2008 international workshop on Data-aware distributed computing (pp. 55-64). ACM.
-
[21] Everitt, B. S., & Dunn, G. (1993). Principal components analysis. Applied Multivariate Data Analysis, Second Edition, 48-73.
-
[22] Li, H., Wang, Y., Zhang, D., Zhang, M., & Chang, E. Y. (2008, October). Pfp: parallel fp-growth for query recommendation. In Proceedings of the 2008 ACM conference on Recommender systems (pp. 107-114). ACM.
-
[23] Kutner, M. H., Nachtsheim, C., & Neter, J. (2004). Applied linear regression models. McGraw-Hill/Irwin.
-
[24] Armbrust, M., Xin, R. S., Lian, C., Huai, Y., Liu, D., Bradley, J. K., ... & Zaharia, M. (2015, May). Spark sql: Relational data processing in spark. In Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data(pp. 1383-1394). ACM.
-
[25] Zaharia, M., Das, T., Li, H., Shenker, S., & Stoica, I. (2012, June). Discretized streams: an efficient and fault-tolerant model for stream processing on large clusters. In Presented as part of the.
-
[26] Thusoo, A., Sarma, J. S., Jain, N., Shao, Z., Chakka, P., Anthony, S., ... & Murthy, R. (2009). Hive: a warehousing solution over a map-reduce framework. Proceedings of the VLDB Endowment, 2(2), 1626-1629.
-
[27] Hamilton, G., Cattell, R., & Fisher, M. (1997). JDBC Database Access with Java(Vol. 7). Addison Wesley.
-
[28] Thein, K. M. M. (2014). Apache kafka: Next generation distributed messaging system. International Journal of Scientific Engineering and Technology Research, 3(47), 9478-9483.
-
[29] Hoffman, S. (2013). Apache Flume: Distributed Log Collection for Hadoop. Packt Publishing Ltd.
-
[30] Sakaki, T., Okazaki, M., & Matsuo, Y. (2010, April). Earthquake shakes Twitter users: real-time event detection by social sensors. In Proceedings of the 19th international conference on World wide web (pp. 851-860). ACM.
-
[32] Xin, R. S., Gonzalez, J. E., Franklin, M. J., & Stoica, I. (2013, June). Graphx: A resilient distributed graph system on spark. In First International Workshop on Graph Data Management Experiences and Systems (p. 2). ACM.
-
[33] Malewicz, G., Austern, M. H., Bik, A. J., Dehnert, J. C., Horn, I., Leiser, N., & Czajkowski, G. (2010, June). Pregel: a system for large-scale graph processing. In Proceedings of the 2010 ACM SIGMOD International Conference on Management of data (pp. 135-146). ACM.
-
[35] Meng, X., Bradley, J., Yavuz, B., Sparks, E., Venkataraman, S., Liu, D., ... & Xin, D. (2016). Mllib: Machine learning in apache spark. The Journal of Machine Learning Research, 17(1), 1235-1241.
-
[37] Son, J., Jung, I., Park, K., & Han, B. (2015). Tracking-by-Segmentation With Online Gradient Boosting Decision Tree. In Proceedings of the IEEE International Conference on Computer Vision (pp. 3056-3064).
-
[39] http://spark.apache.org/docs/latest/ml-guide.html#announcement-dataframe-based-api-is-primary-api
-
[40] Salperwyck, C., Maby, S., Cubillé, J., & Lagacherie, M. (2015, September). CourboSpark: Decision Tree for Time-series on Spark. In AALTD@ PKDD/ECML.
-
[41] Kaveh, M. (2015). ETL and Analysis of IoT data using OpenTSDB, Kafka, and Spark (Master's thesis, University of Stavanger, Norway).
-
[42] Arora, S. (2017). Analyzing mobile phone usage using clustering in Spark MLLib and Pig. International Journal, 8(1).
-
[47] Rajaratnam, B., Roberts, S., Sparks, D., & Dalal, O. (2016). Lasso regression: estimation and shrinkage via the limit of Gibbs sampling. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 78(1), 153-174.
-
[51] Son, J., Jung, I., Park, K., & Han, B. (2015). Tracking-by-Segmentation With Online Gradient Boosting Decision Tree. In Proceedings of the IEEE International Conference on Computer Vision (pp. 3056-3064).
-
[55] Maarala, A. I., Rautiainen, M., Salmi, M., Pirttikangas, S., & Riekki, J. (2015, October). Low latency analytics for streaming traffic data with Apache Spark. In Big Data (Big Data), 2015 IEEE International Conference on (pp. 2855-2858). IEEE.
-
[57] Awan, A. J., Brorsson, M., Vlassov, V., & Ayguade, E. (2016). Architectural Impact on Performance of In-memory Data Analytics: Apache Spark Case Study. arXiv preprint arXiv:1604.08484.
-
[59] Kulkarni, S. (2015). A Recommendation Engine Using Apache Spark.
-
[61] Hindman, B., Konwinski, A., Zaharia, M., Ghodsi, A., Joseph, A. D., Katz, R. H., ... & Stoica, I. (2011, March). Mesos: A Platform for Fine-Grained Resource Sharing in the Data Center. In NSDI (Vol. 11, No. 2011, pp. 22-22).
-
[62] Ghodsi, A., Zaharia, M., Hindman, B., Konwinski, A., Shenker, S., & Stoica, I. (2011, March). Dominant Resource Fairness: Fair Allocation of Multiple Resource Types. In NSDI (Vol. 11, No. 2011, pp. 24-24).
-
[65] Wang, S. C., Yan, K. Q., Liao, W. P., & Wang, S. S. (2010, July). Towards a load balancing in a three-level cloud computing network. In Computer Science and Information Technology (ICCSIT), 2010 3rd IEEE International Conference on(Vol. 1, pp. 108-113). IEEE.
-
[67] He, X., Sun, X., & Von Laszewski, G. (2003). QoS guided min-min heuristic for grid task scheduling. Journal of Computer Science and Technology, 18(4), 442-451.
-
[68] He, X., Sun, X., & Von Laszewski, G. (2003). QoS guided min-min heuristic for grid task scheduling. Journal of Computer Science and Technology, 18(4), 442-451.
-
[69] Bhoi, U., & Ramanuj, P. N. (2013). Enhanced max-min task scheduling algorithm in cloud computing. International Journal of Application or Innovation in Engineering and Management (IJAIEM), 2(4), 259-264.
-
[72] Parsa, S., & Entezari-Maleki, R. (2009). RASA: A new task scheduling algorithm in grid environment. World Applied sciences journal, 7(Special issue of Computer & IT), 152-160.
-
[75] Manjaly, J. S., & Chooralil, V. S. (2013, August). Tasktracker aware scheduling for hadoop mapreduce. In Advances in Computing and Communications (ICACC), 2013 Third International Conference on (pp. 278-281). IEEE.
-
[76] Hadoop's Capacity Scheduler http://hadoop.apache.org/core/docs/current/ capacity-scheduler.html.
-
[77] Ferguson, A. D., Bodik, P., Kandula, S., Boutin, E., & Fonseca, R. (2012, April). Jockey: guaranteed job latency in data parallel clusters. In Proceedings of the 7th ACM european conference on Computer Systems (pp. 99-112). ACM.
-
[81] Assunção, M. D., Calheiros, R. N., Bianchi, S., Netto, M. A., & Buyya, R. (2015). Big Data computing and clouds: Trends and future directions. Journal of Parallel and Distributed Computing, 79, 3-15.
-
[82] Zhang, Q., Zhani, M. F., Zhang, S., Zhu, Q., Boutaba, R., & Hellerstein, J. L. (2012, September). Dynamic energy-aware capacity provisioning for cloud computing environments. In Proceedings of the 9th international conference on Autonomic computing (pp. 145-154). ACM.
-
[84] Kopytov, A. (2004). SysBench: a system performance benchmark. URL: http://sysbench. sourceforge. net.
-
[86] Leonard Kaufman and J. Peter Rousseeuw, "Clustering by means of Medoids," in Statistical Data Analysis Based on the L_1–Norm and Related Methods, 1987, pp. 405–416.
-
[87] Leonard Kaufman and Peter J. Rousseuw, Finding Groups in Data: An Introduction to Cluster Analysis.: Wiley, 1990.
-
[88] Wen-Hao Cheng, "The development of an intelligent, cloud-basedremote monitoring management system," 國立中山大學電機工程學系研究所, 碩士論文 2012.
-
[89] Sachdeva, H. K., & Jain, A. (2017). The Survey of Enhanced Min-Max Approach For Resource Aware Job Scheduling In Cloud Computing. International Journal, 8(4).
-
[91] Poess, M., Nambiar, R. O., & Walrath, D. (2007, September). Why you should run TPC-DS: a workload analysis. In Proceedings of the 33rd international conference on Very large data bases (pp. 1138-1149). VLDB Endowment.
-
[93] Puurula, A., Read, J., & Bifet, A. (2014). Kaggle LSHTC4 winning solution. arXiv preprint arXiv:1405.0546.
-
[94] Bholowalia, P., & Kumar, A. (2014). EBK-means: A clustering technique based on elbow method and k-means in WSN. International Journal of Computer Applications, 105(9).
|