{"id":438153,"date":"2024-10-20T08:02:34","date_gmt":"2024-10-20T08:02:34","guid":{"rendered":"https:\/\/pdfstandards.shop\/product\/uncategorized\/ieee-3302-2022\/"},"modified":"2024-10-26T15:06:33","modified_gmt":"2024-10-26T15:06:33","slug":"ieee-3302-2022","status":"publish","type":"product","link":"https:\/\/pdfstandards.shop\/product\/publishers\/ieee\/ieee-3302-2022\/","title":{"rendered":"IEEE 3302-2022"},"content":{"rendered":"
New IEEE Standard – Active. This standard adopts the Moving Picture, Audio and Data Coding by Artificial Intelligence (MPAI) Technical Specification Context-based Audio Enhancement (CAE) Version 1.4 as an IEEE standard. MPAI-CAE Version 1.4 is a collection of four use cases specifying AI-based technologies for audio-related applications, including entertainment, communication, post-production, teleconferencing, and restoration.
| PDF Pages | PDF Title |
|---|---|
| 1 | IEEE Std 3302™-2022 Front Cover |
| 2 | Title page |
| 4 | Important Notices and Disclaimers Concerning IEEE Standards Documents |
| 8 | Participants |
| 9 | Introduction |
| 10 | Specification for MPAI-CAE |
| 11 | Contents |
| 13 | 1 Introduction (Informative) |
| 14 | 2 Scope of standard |
| 15 | 2.1 Emotion-Enhanced Speech (EES); 2.2 Audio Recording Preservation (ARP); 2.3 Speech Restoration System (SRS); 2.4 Enhanced Audioconference Experience (EAE) |
| 16 | 2.5 Normative content of the Use Cases; 3 Terms and Definitions |
| 18 | 4 References; 4.1 Normative References |
| 19 | 4.2 Informative References; 5 Use Case Architectures; 5.1 Emotion-Enhanced Speech (EES); 5.1.1 Scope of Use Case; 5.1.2 I/O data |
| 20 | 5.1.3 Implementation Architecture; 5.1.4 AI Modules |
| 21 | 5.2 Audio Recording Preservation (ARP); 5.2.1 Scope of Use Case; 5.2.2 I/O data; 5.2.3 Implementation Architecture |
| 23 | 5.2.4 AI Modules; 5.3 Speech Restoration System (SRS); 5.3.1 Scope of Use Case |
| 24 | 5.3.2 I/O Data; 5.3.3 Implementation Architecture |
| 25 | 5.3.4 AI Modules; 5.4 Enhanced Audioconference Experience (EAE); 5.4.1 Scope of Use Case |
| 26 | 5.4.2 I/O data; 5.4.3 Implementation Architecture |
| 27 | 5.4.4 AI Modules |
| 28 | 6 AIMs; 6.1 AIM Interoperability; 6.2 AIMs and their data; 6.2.1 Emotion Enhanced Speech; 6.2.2 Audio Recording Preservation (ARP) |
| 29 | 6.2.3 Speech Restoration System (SRS); 6.2.4 Enhanced Audioconference Experience (EAE); 6.3 Data Formats |
| 30 | 6.3.1 Access Copy Files; 6.3.2 Audio Scene Geometry; 6.3.2.1 Syntax |
| 31 | 6.3.2.2 Semantics |
| 32 | 6.3.3 Damaged List; 6.3.3.1 Syntax; 6.3.3.2 Semantics; 6.3.4 Denoised Speech |
| 33 | 6.3.5 Editing List; 6.3.5.1 Syntax |
| 34 | 6.3.5.2 Semantics |
| 35 | 6.3.6 Emotion; 6.3.6.1 Syntax; 6.3.6.2 Semantics |
| 39 | 6.3.7 Emotionless Speech; 6.3.8 Interleaved Multichannel Audio; 6.3.9 Irregularity File; 6.3.9.1 Syntax |
| 40 | 6.3.9.2 Semantics |
| 42 | 6.3.10 Irregularity Image; 6.3.11 Microphone Array Audio; 6.3.12 Microphone Array Geometry; 6.3.12.1 Syntax |
| 43 | 6.3.12.2 Semantics |
| 44 | 6.3.13 Mode Selection |
| 45 | 6.3.14 Multichannel Audio Stream; 6.3.15 Neural Network Speech Model |
| 46 | 6.3.16 Preservation Audio File; 6.3.17 Preservation Audio-Visual File; 6.3.18 Preservation Master Files; 6.3.19 Source Dictionary; 6.3.20 Source Model KB Query Format; 6.3.21 Speech Features |
| 47 | 6.3.21.1 Semantics |
| 48 | 6.3.22 Spherical Harmonic Decomposition; 6.3.23 Transform Denoised Speech; 6.3.24 Transform Speech |
| 49 | 6.3.25 Transform Multichannel Audio; 6.3.26 Video |
| 50 | Annex 1 MPAI-wide terms and definitions |
| 53 | Annex 2 Notices and Disclaimers Concerning MPAI Standards (Informative) |
| 55 | Annex 3 Patent Declarations |
| 56 | Annex 4 Examples (Informative); A4.1 Audio Scene Geometry; A4.2 Damaged List; A4.3 Editing List |
| 57 | A4.4 Irregularity File |
| 58 | A4.5 Microphone Array Geometry |
| 59 | A4.6 Speech Features 1; A4.7 Speech Features 2 |
| 60 | Annex 5 AIW and AIM Metadata of CAE-EES; A5.1 AIW Metadata |
| 62 | A5.2 AIM Metadata; A5.2.1 Speech Feature Analyser1 |
| 63 | A5.2.2 Speech Feature Analyser2 |
| 64 | A5.2.3 Emotion Feature Producer |
| 65 | A5.2.4 Emotion Inserter1; A5.2.5 Emotion Inserter2 |
| 67 | Annex 6 AIW and AIM Metadata of CAE-ARP; A6.1 AIW metadata |
| 71 | A6.2 AIM metadata; A6.2.1 Audio Analyser |
| 72 | A6.2.2 Video Analyser |
| 73 | A6.2.3 Tape Irregularity Classifier |
| 74 | A6.2.4 Tape Audio Restoration |
| 75 | A6.2.5 Packager |
| 77 | Annex 7 AIW and AIM Metadata of CAE-SRS; A7.1 AIW metadata |
| 79 | A7.2 AIM metadata; A7.2.1 Speech Model Creation; A7.2.2 Speech Synthesiser |
| 80 | A7.2.3 Assembler |
| 82 | Annex 8 AIW and AIM Metadata of CAE-EAE; A8.1 AIW Metadata |
| 87 | A8.2 AIM Metadata; A8.2.1 Analysis Transform |
| 88 | A8.2.2 Sound Field Description |
| 89 | A8.2.3 Speech Detection and Separation |
| 90 | A8.2.4 Noise Cancellation |
| 91 | A8.2.5 Synthesis Transform |
| 92 | A8.2.6 Packager |
| 94 | Back Cover |