Developing Continuous Reinforcement Learning for Distributed Spatial Problems (Case Study: Intelligent Traffic Signal Control)

Mohammad Aslani; Mohammad Saadi Mesgari

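Published in: Journal of Iranian Association of Electrical and Electronics Engineers (فصلنامه مهندسی برق و الکترونیک ایران), Volume 17, Issue 3

Year of publication: 1399 (Solar Hijri)

Document type: Journal article

Language: Persian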

The full text of this article is available as a 16-page PDF.

Permanent link to this article:

https://civilica.com/doc/1157344

National scientific document ID:

JR_JIAE-17-3_007

Indexed on: 3 Esfand 1399

Abstract:

Multi-agent systems, a branch of artificial intelligence, have in recent years been widely used to study and analyze phenomena that are distributed, complex, bottom-up, and dynamic, in fields such as traffic, transportation, economics, and the environment. The main challenge in multi-agent systems is obtaining appropriate behavior for each individual agent so that the system as a whole reaches optimal high-level behavior. Reinforcement learning suits this challenge because it can obtain the optimal behavior for all agents automatically and incrementally through interaction with the environment. In reinforcement learning, agents learn over time, by interacting with the environment, which actions to take in different situations (states) in order to receive the greatest reward. Common reinforcement learning methods perform poorly in real-world problems whose number of environment states is very large or infinite, because they store a separate value in memory for each state-action pair, and the agent must visit each pair many times to obtain an accurate estimate of its value. The contribution of this research is to address this challenge through continuous reinforcement learning in spatial problems with large, continuous state-action spaces. Continuous reinforcement learning uses the concept of generalization to estimate state-action values. With this approach the agent does not need direct experience in every state of the environment; the value of a state is estimated from the values of similar states. To measure similarity, these methods need an encoding of the environment states; this research used space partitioning, which has a low computational cost. Traffic control (specifically, traffic signal management), which is highly dynamic and complex, was selected as a suitable case study.
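The generalization idea in the abstract can be illustrated with a minimal sketch. The abstract does not say which partitioning scheme or update rule the authors used, so the code below assumes tile coding (several overlapping, offset grids over the state space) combined with a linear Q-learning update. The class names `TileCoder` and `TileCodedQ`, the two-dimensional state (normalized queue lengths on two approaches), and the two actions (signal phases) are all hypothetical choices made for illustration.

```python
import numpy as np


class TileCoder:
    """Partitions a bounded continuous state space with several offset grids.

    Assumption: uniform grids shifted by a fraction of a tile width.
    """

    def __init__(self, low, high, tiles_per_dim=8, n_tilings=4):
        self.low = np.asarray(low, dtype=float)
        self.high = np.asarray(high, dtype=float)
        self.tiles_per_dim = tiles_per_dim
        self.n_tilings = n_tilings
        self.dims = len(self.low)
        # One extra tile per dimension absorbs the offset of each tiling.
        self.tiles_per_tiling = (tiles_per_dim + 1) ** self.dims
        self.n_features = n_tilings * self.tiles_per_tiling

    def active_tiles(self, state):
        """Return one active tile index per tiling for the given state."""
        scaled = (np.asarray(state, dtype=float) - self.low) / (self.high - self.low)
        scaled = np.clip(scaled, 0.0, 1.0) * self.tiles_per_dim
        indices = []
        for t in range(self.n_tilings):
            offset = t / self.n_tilings  # shift in units of one tile width
            coords = np.floor(scaled + offset).astype(int)
            flat = 0
            for c in coords:  # row-major flattening of the grid coordinates
                flat = flat * (self.tiles_per_dim + 1) + int(c)
            indices.append(t * self.tiles_per_tiling + flat)
        return indices


class TileCodedQ:
    """Linear Q-value approximation over tile features (hypothetical sketch)."""

    def __init__(self, coder, n_actions, alpha=0.1, gamma=0.95):
        self.coder = coder
        self.n_actions = n_actions
        self.alpha = alpha / coder.n_tilings  # split the step size across tilings
        self.gamma = gamma
        self.w = np.zeros((n_actions, coder.n_features))

    def q(self, state, action):
        """Estimated value: sum of the weights of the active tiles."""
        return self.w[action, self.coder.active_tiles(state)].sum()

    def update(self, s, a, r, s_next, done=False):
        """One Q-learning step; only the tiles active in s are adjusted."""
        best_next = max(self.q(s_next, b) for b in range(self.n_actions))
        target = r if done else r + self.gamma * best_next
        tiles = self.coder.active_tiles(s)
        td_error = target - self.w[a, tiles].sum()
        self.w[a, tiles] += self.alpha * td_error
        return td_error


# Hypothetical usage: state = normalized queue lengths on two approaches.
coder = TileCoder(low=[0.0, 0.0], high=[1.0, 1.0])
agent = TileCodedQ(coder, n_actions=2)
agent.update([0.50, 0.50], 0, 1.0, [0.50, 0.50], done=True)
print(agent.q([0.50, 0.50], 0))  # visited state
print(agent.q([0.56, 0.50], 0))  # similar state, never visited
print(agent.q([0.05, 0.95], 0))  # dissimilar state
```

Because a nearby state shares some tiles with the visited one, its estimate moves after a single update even though it was never experienced, while a distant state is unaffected. A tabular method would instead need to visit each state separately.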

Keywords:

Multi-agent systems, continuous reinforcement learning, space partitioning, traffic control.

Authors

Mohammad Aslani

Department of Geospatial Information System (GIS), Faculty of Geodesy and Geomatics Engineering, K.N. Toosi University of Technology

Mohammad Saadi Mesgari

Department of Geospatial Information System (GIS), Faculty of Geodesy and Geomatics Engineering, K.N. Toosi University of Technology
