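Performance portability

Performance portability refers to the ability of computer programs and applications to operate effectively across different platforms. Developers of performance-portable applications seek to support multiple platforms without impeding performance, while minimizing platform-specific code.^[1]

It is a sought-after commodity in the HPC (high-performance computing) community, but there is no universal or agreed-upon way to measure it. There is some contention over whether portability refers to the portability of an application or the portability of its source code.

Performance can be measured in two ways: by comparing an optimized version of an application with its portable version, or by comparing achieved performance with the application's theoretical peak performance, which is derived from the number of FLOPs (floating-point operations) performed relative to the data moved from main memory to the processor.

Because hardware is so diverse, developing software that works on a wide variety of machines is increasingly important for the longevity of an application.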

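Contentions

The term performance portability is used frequently in industry and generally refers to: "(1) the ability to run one application across multiple hardware platforms; and (2) achieving some notional level of performance on these platforms."^[1] For example, at the 2016 DOE (United States Department of Energy) Centers of Excellence Performance Portability Meeting, John Pennycook (Intel) stated, "An application is performance portable if it achieves a consistent level of performance [e.g. defined by execution time or other figure of merit, not percentage of peak FLOPS (floating-point operations per second) across platforms] relative to the best known implementation on each platform."^[2] More directly, Jeff Larkin (NVIDIA) noted that performance portability means that "the same source code will run productively on a variety of different architectures."^[2]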
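One way to make the "consistent level of performance relative to the best known implementation" idea concrete is to combine per-platform efficiencies into a single score. The sketch below uses the harmonic mean of efficiencies and returns zero when any platform is unsupported, which is the general form of the metric discussed by Pennycook, Sewall and Lee.^[1] The function name, the platform names, and the efficiency values are hypothetical and chosen only for illustration.

```python
from statistics import harmonic_mean


def performance_portability(efficiencies):
    """Harmonic mean of per-platform efficiencies, or 0.0 if any platform fails.

    `efficiencies` maps a platform name to an efficiency in (0, 1], measured
    relative to the best known implementation on that platform, or to None
    when the application does not run there.
    """
    values = list(efficiencies.values())
    if not values or any(e is None or e <= 0 for e in values):
        return 0.0
    return harmonic_mean(values)


# Hypothetical efficiencies for one application on three platforms.
measured = {"cpu_a": 0.90, "gpu_b": 0.60, "gpu_c": 0.45}
score = performance_portability(measured)

# The same application with one unsupported platform scores zero.
unsupported = performance_portability({"cpu_a": 0.90, "gpu_b": None})
```

The harmonic mean is dominated by the weakest platform, so a code that is fast on one machine and slow on another scores lower than one that is moderately efficient everywhere.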

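Performance portability is a key topic of discussion within the HPC community. Collaborators from industry, academia, and DOE national laboratories meet annually at the Performance, Portability, and Productivity in HPC (P3HPC) forum,^[3] launched in 2016, to discuss ideas and progress toward performance portability goals on current and future HPC platforms.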

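Measuring performance portability

Quantifying when a program reaches performance portability depends on two factors. The first factor, portability, can be measured by the total lines of code that are used across multiple architectures versus the total lines of code that target a single architecture.^[14]^[1] There is some contention over whether portability refers to the portability of an application (that is, whether it runs everywhere) or the portability of its source code (that is, how much of the code is specialized). The second factor, performance, can be measured in several ways. One method is to compare the performance of a platform-optimized version of an application with the performance of a portable version of the same application.^[14]^[1] Another method is to construct a roofline performance model, which gives the theoretical peak performance of an application based on the number of FLOPs performed relative to the data moved from main memory to the processor over the course of program execution.^[15]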
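The three measurements described above can be sketched in a few lines. This is a minimal illustration, not a standard tool: the function names are hypothetical, the code-line measure is expressed here as the shared fraction of all lines (one possible normalisation of the shared-versus-specialized comparison), and the roofline bound uses the usual simplified form, the smaller of peak compute and arithmetic intensity times peak memory bandwidth. All input numbers are invented.

```python
def code_portability(shared_lines, platform_specific_lines):
    """Fraction of the code base that is shared across all architectures."""
    total = shared_lines + platform_specific_lines
    return shared_lines / total if total else 0.0


def application_efficiency(portable_time, best_known_time):
    """Speed of the portable version relative to the platform-optimized one."""
    return best_known_time / portable_time


def roofline_bound(flops, bytes_moved, peak_flops, peak_bandwidth):
    """Attainable FLOP/s under a simple roofline model.

    Arithmetic intensity is FLOPs per byte moved from main memory; the bound
    is limited either by peak compute or by memory bandwidth.
    """
    intensity = flops / bytes_moved
    return min(peak_flops, intensity * peak_bandwidth)


# Hypothetical inputs: 9,000 shared lines and 1,000 specialized lines.
portability = code_portability(9000, 1000)

# Portable version takes 12.5 s, the tuned version takes 10.0 s.
efficiency = application_efficiency(portable_time=12.5, best_known_time=10.0)

# A kernel doing 2e9 FLOPs while moving 8e9 bytes, on a machine with
# 1e12 FLOP/s peak compute and 100e9 bytes/s peak memory bandwidth.
bound = roofline_bound(2e9, 8e9, peak_flops=1e12, peak_bandwidth=100e9)
```

In this example the kernel is memory-bound: its arithmetic intensity of 0.25 FLOP per byte caps it well below the machine's peak compute rate, so comparing achieved performance to the roofline bound is more informative than comparing it to peak FLOPS.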

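There is currently no universal standard for what makes code or an application truly performance portable, nor is there agreement on whether the proposed measurement methods accurately capture the issues that matter to code teams. At the 2016 DOE Centers of Excellence Performance Portability Meeting, speaker David Richards of Lawrence Livermore National Laboratory stated, "A code is performance portable when the application team says it's performance portable!"^[2]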

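A 2019 study titled "Performance Portability across Diverse Computer Architectures" analyzed multiple parallel programming models across a diverse set of architectures to determine the current state of performance portability.^[16] The study concluded that, when writing performance-portable code, it is important to use open (standard) programming models that are supported by multiple vendors across multiple hardware platforms, to expose maximal parallelism at all levels of the algorithm and application, and to develop and improve the code on multiple platforms simultaneously. It also concluded that multi-objective autotuning can help find suitable parameters in a flexible code base to achieve good performance on all platforms.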
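The autotuning idea can be illustrated with a small sketch that picks one tunable parameter to work well on every platform at once. This is not the study's method: the runtimes table, the `block_size`-style parameter values, and the choice to combine the per-platform objectives with a harmonic mean are all assumptions made for illustration.

```python
from statistics import harmonic_mean

# Hypothetical measured runtimes in seconds: runtimes[platform][parameter].
runtimes = {
    "cpu": {32: 4.0, 64: 5.0, 128: 8.0},
    "gpu": {32: 6.0, 64: 2.5, 128: 2.0},
}


def efficiencies(runtimes):
    """Efficiency of each setting relative to the best setting per platform."""
    result = {}
    for platform, by_param in runtimes.items():
        best = min(by_param.values())
        result[platform] = {p: best / t for p, t in by_param.items()}
    return result


def best_shared_parameter(runtimes):
    """Pick the parameter with the highest harmonic-mean efficiency."""
    eff = efficiencies(runtimes)
    # Only consider parameter values that were measured on every platform.
    params = set.intersection(*(set(by_param) for by_param in eff.values()))
    return max(
        sorted(params),
        key=lambda p: harmonic_mean([eff[platform][p] for platform in eff]),
    )


choice = best_shared_parameter(runtimes)
```

With these numbers, 32 is the best setting for the CPU and 128 is the best for the GPU, but 64 gives 80% efficiency on both and is therefore selected. A real autotuner would search a much larger parameter space and might keep separate per-platform settings instead of one shared value.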

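A 2022 study argues that an adequate and comprehensive definition of performance portability for parallel applications is desirable but rather complex, and that it is doubtful whether such a definition would be accepted by most researchers and developers in the scientific community. In addition, the changes that have occurred in the development of parallel programming models over the past two decades, in particular the new performance-portable abstractions added in current versions and those expected to be added in the coming years, point to a new trend in the field. This trend indicates that the performance portability a parallel programming model provides to an application will matter more than the performance portability the application can provide on its own. In other words, parallel programming models are expected to become more descriptive than prescriptive, shifting much of the responsibility from the programmer to the programming model implementation and its underlying compiler, which will ultimately determine the degree of an application's performance portability. This is a fundamental conceptual change in how applications will be developed in the near future. As a result of this change, the level of abstraction of the definition of performance portability needs to be raised. The study therefore proposes a definition of performance portability that is centered on the parallel programming model rather than on the application.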

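Framework and non-framework solutions

There are a number of programming applications and systems that help programmers make their applications performance portable. Frameworks that claim to support functional portability include OpenCL, SYCL, Kokkos, RAJA, Java, OpenMP, and OpenACC. These programming interfaces support multi-platform multiprocessing programming in particular programming languages.^[19] Non-framework solutions include self-tuning and domain-specific languages.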

References

[1] Pennycook, John; Sewall, Jason; Lee, Victor (8 August 2017). "Implications of a metric for performance portability". Future Generation Computer Systems. 92: 947–958. doi:10.1016/j.future.2017.08.007. S2CID 57380225. Archived from the original on 13 April 2019. Retrieved 10 October 2021 – via ScienceDirect.
[2] Neely, Rob J. (April 21, 2016). "DOE Centers of Excellence Performance Portability Meeting". U.S. Department of Energy: Office of Scientific and Technical Information. doi:10.2172/1332474. OSTI 1332474. Archived from the original on July 27, 2021. Retrieved July 27, 2021.
[3] "P3HPC: Performance, Portability & Productivity in HPC". P3HPC.
[4] Matthias, Jacob; Randall, Keith (2002). "Cross-Architectural Performance Portability of a Java Virtual Machine Implementation". USENIX (The Advanced Computing Systems Association). Archived from the original on 2021-10-10. Retrieved 2021-10-10.
[5] Edwards, H. Carter; Sunderland, Daniel; Porter, Vicki; Amsler, Chris; Mish, Sam (2012). "Manycore Performance-Portability: Kokkos Multidimensional Array Library". Scientific Programming. 20 (2): 89–114. doi:10.3233/SPR-2012-0343. ISSN 1058-9244. Archived from the original on 2021-10-10. Retrieved 2021-10-10 – via Hindawi.
[6] Bosilca, George; Bouteiller, Aurelien; Herault, Thomas; Lemariner, Pierre; Saengpatsa, Narapat; Tomov, Stanimire; Dongarra, Jack (2011). "Performance Portability of a GPU Enabled Factorization with the DAGuE Framework". IEEE Cluster: Workshop on Parallel Programming on Accelerator Clusters (PPAC): 1–8. Archived from the original on 2021-10-10. Retrieved 2021-10-10 – via The University of Texas.
[7] Nemire, Brad (2015-10-29). "Performance Portability for GPUs and CPUs with OpenACC". NVIDIA Developer Blog. Archived from the original on 2021-10-10. Retrieved 2021-10-10.
[8] Howard, Micah; Bradley, Andrew Michael; Bova, Steven W.; Overfelt, James R.; Wagnild, Ross Martin; Dinzl, Derek John; Hoemmen, Mark Frederick; Klinvex, Alicia Marie (2017-06-01). "Towards a Performance Portable Compressible CFD Code". OSTI 1458230. Archived from the original on 2021-10-10. Retrieved 2021-10-10.
[9] McCool, Michael D. (2012). Structured parallel programming : patterns for efficient computation. James Reinders, Arch Robison. Amsterdam: Elsevier/Morgan Kaufmann. ISBN 978-0-12-391443-9. OCLC 798575627. Archived from the original on 2023-01-17. Retrieved 2021-10-10.
[10] Hemsoth, Nicole (2020-11-19). ""Wombat" Puts Arm's SVE Instruction Set to the Test". The Next Platform. Archived from the original on 2021-06-24.
[11] "NERSC, ALCF, Codeplay Partner on SYCL GPU Compiler". insideHPC. 2021-03-01. Archived from the original on 2021-10-10.
[12] Marques, Osni (9 December 2020). "Software Design for Longevity with Performance Portability". Exascale Computing Project. Archived from the original on 27 July 2021.
[13] "DOE COE Performance Portability Meeting 2017". www.lanl.gov. August 2017. Archived from the original on 2021-07-27.
[14] "Measurement Techniques - Performance Portability". performanceportability.org. Archived from the original on 2021-04-13.
[15] "Quantitatively Assessing Performance Portability with Roofline". Exascale Computing Project. 23 January 2019. Archived from the original on 27 July 2021.
[16] Deakin, Tom J.; McIntosh-Smith, Simon N.; Price, James; Poenaru, Andrei; Atkinson, Patrick R.; Popa, Codrin; Salmon, Justin (2020-01-02). "Performance Portability across Diverse Computer Architectures". 2019 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC). Institute of Electrical and Electronics Engineers (IEEE). pp. 1–13. doi:10.1109/P3HPC49587.2019.00006. hdl:1983/5c8737cc-f4ee-44c3-b55d-1f31d3fcc135. ISBN 978-1-7281-6003-0. S2CID 208973842. Archived from the original on 2021-10-10. Retrieved 2021-10-10 – via University of Bristol.
[17] Marowka, Ami (12 January 2022). "On the Performance Portability of OpenACC, OpenMP, Kokkos and RAJA". International Conference on High Performance Computing in Asia-Pacific Region. HPCAsia2022. pp. 103–114. doi:10.1145/3492805.3492806. ISBN 9781450384988. S2CID 245803003. Archived from the original on 5 February 2022. Retrieved 4 February 2022 – via ACM Digital Library.
[18] Marowka, Ami (January 2022). "Reformulation of the performance portability metric". Software: Practice and Experience. 52 (1): 154–171. doi:10.1002/spe.3002. S2CID 236313509. Archived from the original on 2022-02-04. Retrieved 2022-02-04 – via Wiley Digital online.
[19] "OpenMP About Us". OpenMP. Archived from the original on 2021-08-05.

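External links

Bibliography

Exascale Scientific Applications: Scalability and Performance Portability. United Kingdom, CRC Press, 2017.
Mazaheri A., Schulte J., Moskewicz M.W., Wolf F., Jannesari A. (2019) Enhancing the Programmability and Performance Portability of GPU Tensor Operations. In: Yahyapour R. (eds) Euro-Par 2019: Parallel Processing. Euro-Par 2019. Lecture Notes in Computer Science, vol 11725. Springer, Cham. https://doi.org/10.1007/978-3-030-29400-7_16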
