Nezer J. Zaidenberg
References
For threads - the relevant chapters of APUE. For SDL and FFmpeg, see their respective help and documentation. There is a pretty good tutorial called "How to write a video player in less than 1000 lines of code" that can help (but we don't use everything shown there), and it has a few bugs (covered here).
Goals
We will discuss the decoding and encoding of rich media (video and audio). This will introduce the problem of syncing, which we will use to discuss threads and synchronisation. This will all be covered in HW 2. As with networking, we don't care about the details of video/audio compression; we just learn how to use this environment.
IMPORTANT
Before we begin
Whether rich media handling is part of the OS is open to debate. (Linux says it isn't. Windows says it is. The EU says it isn't. The USA says it is...) The rich media libraries we learn today are common in any Linux distribution but are not part of the Linux kernel. The libraries are very portable, also work on Windows, and are used by players such as VLC. (Windows has other methods you COULD use.)
The problem
Data quantities are huge, even unworkable. Solution: encode the media (compress it) so that we lose some data but the quality remains good, and provide a decoder (decompressor) to play the media back in good quality.
Introducing codecs
The compressor/decompressor pair is called a codec (short for coder/decoder). Normally we have separate codecs for video and audio. Most codecs are lossy, meaning we lose a little quality in the encoding process; some codecs are lossless.
Modern codecs
Video: MPEG2, MPEG4 (DivX, Microsoft, Xvid), H.264 - video can be compressed by 1:50-1:5000 depending on codec and quality. Audio: MP3, Vorbis, AAC, WMA - good quality audio can be compressed by 1:5-1:50.
Decoding streams.
FFMPEG
We will be using the ffmpeg library to encode and decode. ffmpeg is a very object-oriented C library. We will be using the factory and facade/interface design patterns.
Obtaining video
I used a YouTube downloader. You may install youtube-dl using: sudo apt-get install youtube-dl
Decoding
Our first task will be decoding. We will open an AVI file of Bruce Springsteen's "Outlaw Pete" performance and save the first frame. By the end of the class we will play the complete video.
using ffmpeg
ffmpeg is usually used as a library by media players (such as VLC), but we can also use ffplay(1) and ffmpeg(1); these are ffmpeg test utilities.
compiling
gcc decode.c -o decode -lavcodec -lavformat -lavutil (feel free to create a makefile)
#include <libavcodec/avcodec.h>
#include <libavformat/avformat.h>
#include <libswscale/swscale.h>
#include <stdio.h>

int main()
{
  av_register_all();
}
Explaining
The three include files are needed for decoding. They include the ffmpeg file-format library (.AVI, .MPG, etc.) and the ffmpeg codec library. av_register_all() is the factory initialization method.
The include paths in XUbuntu are as shown in my slides and not as in the manual!!!
So what is a Factory
A factory object contains constructors. register_all registers the constructors of all objects and associates each object with a key. The factory then gives you the right object when you ask for it by key (e.g. the right codec for a given codec id). New objects can be added easily, and programs can ask for them without knowing their constructors.
// decode1.c (based on tutorial1.c in the manual!)
AVFormatContext *pFormatCtx;
int             i, videoStream;
AVCodecContext  *pCodecCtx;
AVCodec         *pCodec;
AVFrame         *pFrame;
AVFrame         *pFrameRGB;
AVPacket        packet;
int             frameFinished;
int             numBytes;
uint8_t         *buffer;

if(argc < 2) {
  printf("Please provide a movie file\n");
  return -1;
}
Don't try to understand all the variables yet; we will use them later. All that matters is that we get the movie filename as the first argument.
// Open the video file
if(av_open_input_file(&pFormatCtx, argv[1], NULL, 0, NULL)!=0)
  return -1; // Couldn't open file
...
// Find the first video stream
for(i=0; i<pFormatCtx->nb_streams; i++)
  if(pFormatCtx->streams[i]->codec->codec_type==CODEC_TYPE_VIDEO) {
    videoStream=i;
    break;
  }
// Decode2.c (based on tutorial1 still)
// Get a pointer to the codec context for the video stream
pCodecCtx=pFormatCtx->streams[videoStream]->codec;

// Find the decoder for the video stream
pCodec=avcodec_find_decoder(pCodecCtx->codec_id);
if(pCodec==NULL) {
  fprintf(stderr, "Unsupported codec!\n");
  return -1; // Codec not found
}
// Open codec
if(avcodec_open(pCodecCtx, pCodec)<0)
  return -1; // Could not open codec

// Allocate video frame
pFrame=avcodec_alloc_frame();
if(pFrame==NULL)
  return -1;

// Allocate an AVFrame structure
pFrameRGB=avcodec_alloc_frame();
if(pFrameRGB==NULL)
  return -1;
Explaining decode2.c
We requested a codec from the factory based on the context (the key). Then we called the codec constructor. Last, we allocated two frames.
// Determine required buffer size and allocate buffer
numBytes=avpicture_get_size(PIX_FMT_RGB24, pCodecCtx->width,
                            pCodecCtx->height);
buffer=(uint8_t *)av_malloc(numBytes*sizeof(uint8_t));

// Assign appropriate parts of buffer to image planes in pFrameRGB
// Note that pFrameRGB is an AVFrame, but AVFrame is a superset
// of AVPicture
avpicture_fill((AVPicture *)pFrameRGB, buffer, PIX_FMT_RGB24,
               pCodecCtx->width, pCodecCtx->height);
while(av_read_frame(pFormatCtx, &packet)>=0) {
  // Is this a packet from the video stream?
  if(packet.stream_index==videoStream) {
    // Decode video frame
    avcodec_decode_video(pCodecCtx, pFrame, &frameFinished,
                         packet.data, packet.size);
    // Did we get a video frame?
    if(frameFinished) {
      // Convert the image from its native format to RGB
      img_convert((AVPicture *)pFrameRGB, PIX_FMT_RGB24,
                  (AVPicture *)pFrame, pCodecCtx->pix_fmt,
                  pCodecCtx->width, pCodecCtx->height);
      SaveFrame(pFrameRGB, pCodecCtx->width, pCodecCtx->height);
      goto exit;
    }
  }
  // Free the packet that was allocated by av_read_frame
  av_free_packet(&packet);
}
exit:
av_free(buffer);
av_free(pFrameRGB);
// Free the YUV frame
av_free(pFrame);
// Close the codec
avcodec_close(pCodecCtx);
// Close the video file
av_close_input_file(pFormatCtx);
return 0;
}
Explaining...
We can read audio or video packets from the stream, but we can only decode video packets with this codec. We check that we got a complete frame and not a partial one; if we did, we have decoded it. We then convert the frame to raw RGB so that we can save it. Last, we call some destructors.
The manual uses the img_convert function, which doesn't exist today (we use swscale). We will demonstrate how to implement img_convert with swscale, or use swscale directly instead.
void SaveFrame(AVFrame *pFrame, int width, int height) {
  FILE *pFile;
  char Filename[32];
  int  y;

  // Open file
  sprintf(Filename, "frame.ppm");
  pFile=fopen(Filename, "wb");
  if(pFile==NULL)
    return;

  // Write header
  fprintf(pFile, "P6\n%d %d\n255\n", width, height);

  // Write pixel data
  for(y=0; y<height; y++)
    fwrite(pFrame->data[0]+y*pFrame->linesize[0], 1, width*3, pFile);

  // Close file
  fclose(pFile);
}
We save the file in PPM format. That's a silly format that contains a simple text header followed by the raw pixel data.
Implementing img_convert
This function used to be part of FFMPEG but it was removed due to licensing issues. It was replaced by swscale - a more powerful interface. We will implement img_convert using swscale.
struct SwsContext *img_convert_ctx;
...
img_convert_ctx = sws_getContext(pCodecCtx->width, pCodecCtx->height,
                                 pCodecCtx->pix_fmt,
                                 pCodecCtx->width, pCodecCtx->height,
                                 PIX_FMT_RGB24, SWS_BILINEAR,
                                 NULL, NULL, NULL);
if(img_convert_ctx == NULL) {
  fprintf(stderr, "Cannot initialize the conversion context!\n");
  exit(1);
}
...
if(frameFinished) {
  sws_scale(img_convert_ctx, pFrame->data, pFrame->linesize, 0,
            pCodecCtx->height, pFrameRGB->data, pFrameRGB->linesize);
  SaveFrame(pFrameRGB, pCodecCtx->width, pCodecCtx->height);
Another solution
void img_convert(AVPicture *target, int targetFmt,
                 AVPicture *source, int sourceFmt, int w, int h)
{
  static struct SwsContext *img_convert_ctx = NULL;
  if(img_convert_ctx == NULL) {
    img_convert_ctx = sws_getContext(w, h, sourceFmt, w, h, targetFmt,
                                     SWS_BICUBIC, NULL, NULL, NULL);
  }
  sws_scale(img_convert_ctx, source->data, source->linesize, 0, h,
            target->data, target->linesize);
}
Introducing X
Relies on the network to deliver screen changes, keystrokes, and mouse movements. Exists in some form on every UNIX host. All use the same X protocol. Some platforms also have their own improved environments; these usually run over UNIX domain sockets. Microsoft has a similar concept with RDP (but it is not their standard interface).
Installing SDL
Same steps as with ffmpeg: start Synaptic and download all the libsdl and libsdl-devel libs. (There will be lots of libs and dependencies.)
#include <SDL.h>
#include <SDL_thread.h>
...
if(SDL_Init(SDL_INIT_VIDEO | SDL_INIT_AUDIO | SDL_INIT_TIMER)) {
  fprintf(stderr, "Could not initialize SDL - %s\n", SDL_GetError());
  exit(1);
}

SDL_Overlay *bmp;
SDL_Surface *screen;
SDL_Rect    rect;
SDL_Event   event;
screen = SDL_SetVideoMode(pCodecCtx->width, pCodecCtx->height, 0, 0);
if(!screen) {
  fprintf(stderr, "SDL: could not set video mode - exiting\n");
  exit(1);
}

// Allocate a place to put our YUV image on that screen
bmp = SDL_CreateYUVOverlay(pCodecCtx->width, pCodecCtx->height,
                           SDL_YV12_OVERLAY, screen);
if(frameFinished) {
  SDL_LockYUVOverlay(bmp);

  AVPicture pict;
  pict.data[0] = bmp->pixels[0];
  pict.data[1] = bmp->pixels[2];
  pict.data[2] = bmp->pixels[1];

  // Convert the image into YUV format that SDL uses
  img_convert(&pict, PIX_FMT_YUV420P,
              (AVPicture *)pFrame, pCodecCtx->pix_fmt,
              pCodecCtx->width, pCodecCtx->height);

  SDL_UnlockYUVOverlay(bmp);

  rect.x = 0;
  rect.y = 0;
  rect.w = pCodecCtx->width;
  rect.h = pCodecCtx->height;
  SDL_DisplayYUVOverlay(bmp, &rect);
}
void img_convert(AVPicture *target, int targetFmt,
                 AVPicture *source, int sourceFmt, int w, int h)
{
  static struct SwsContext *img_convert_ctx = NULL;
  if(img_convert_ctx == NULL) {
    img_convert_ctx = sws_getContext(w, h, sourceFmt, w, h, targetFmt,
                                     SWS_BICUBIC, NULL, NULL, NULL);
  }
  sws_scale(img_convert_ctx, source->data, source->linesize, 0, h,
            target->data, target->linesize);
}
Compiling SDL
`sdl-config --cflags --libs` will output the C header paths and libs needed to use SDL. The backticks make the output part of the command line. Total compile line: gcc decode4.c -o decode4 -lavcodec -lavformat -lavutil -lswscale `sdl-config --cflags --libs`
Audio
aCodecCtx=pFormatCtx->streams[audioStream]->codec;

// Set audio settings from codec info
wanted_spec.freq = aCodecCtx->sample_rate;
wanted_spec.format = AUDIO_S16SYS;
wanted_spec.channels = aCodecCtx->channels;
wanted_spec.silence = 0;
wanted_spec.samples = SDL_AUDIO_BUFFER_SIZE;
wanted_spec.callback = audio_callback;
wanted_spec.userdata = aCodecCtx;

if(SDL_OpenAudio(&wanted_spec, &spec) < 0) {
  ...
}

aCodec = avcodec_find_decoder(aCodecCtx->codec_id);
if(!aCodec) {
  ...
}
avcodec_open(aCodecCtx, aCodec);
Important
packet_queue_init(&audioq);
SDL_PauseAudio(0);
SO ...
Our main thread reads audio packets and puts them in a queue. And SDL starts an audio thread - a separate thread of control that reads from the queue.
void audio_callback(void *userdata, Uint8 *stream, int len) {
  AVCodecContext *aCodecCtx = (AVCodecContext *)userdata;
  int len1, audio_size;

  static uint8_t audio_buf[(AVCODEC_MAX_AUDIO_FRAME_SIZE * 3) / 2];
  static unsigned int audio_buf_size = 0;
  static unsigned int audio_buf_index = 0;

  while(len > 0) {
    if(audio_buf_index >= audio_buf_size) {
      /* We have already sent all our data; get more */
      audio_size = audio_decode_frame(aCodecCtx, audio_buf,
                                      sizeof(audio_buf));
      if(audio_size < 0) {
        /* If error, output silence */
        audio_buf_size = 1024;
        memset(audio_buf, 0, audio_buf_size);
      } else {
        audio_buf_size = audio_size;
      }
      audio_buf_index = 0;
    }
    len1 = audio_buf_size - audio_buf_index;
    if(len1 > len)
      len1 = len;
    memcpy(stream, (uint8_t *)audio_buf + audio_buf_index, len1);
    len -= len1;
    stream += len1;
    audio_buf_index += len1;
  }
}
int audio_decode_frame(AVCodecContext *aCodecCtx, uint8_t *audio_buf,
                       int buf_size) {
  static AVPacket pkt;
  static uint8_t *audio_pkt_data = NULL;
  static int audio_pkt_size = 0;
  int len1, data_size;

  for(;;) {
    while(audio_pkt_size > 0) {
      data_size = buf_size;
      len1 = avcodec_decode_audio2(aCodecCtx, (int16_t *)audio_buf,
                                   &data_size,
                                   audio_pkt_data, audio_pkt_size);
      if(len1 < 0) {
        /* if error, skip frame */
        audio_pkt_size = 0;
        break;
      }
      audio_pkt_data += len1;
      audio_pkt_size -= len1;
      if(data_size <= 0) {
        /* No data yet, get more frames */
        continue;
      }
      /* We have data, return it and come back for more later */
      return data_size;
    }
    if(pkt.data)
      av_free_packet(&pkt);
    if(quit) {
      return -1;
    }
    if(packet_queue_get(&audioq, &pkt, 1) < 0) {
      return -1;
    }
    audio_pkt_data = pkt.data;
    audio_pkt_size = pkt.size;
  }
}
In the main thread we put audio packets in a queue. In the audio thread (SDL opened that one for us, but trust me, it's there...) we get audio packets from the queue...
PacketQueue audioq;
int quit = 0;
MUTEX (protecting the packet queue)
int packet_queue_put(PacketQueue *q, AVPacket *pkt) {
  AVPacketList *pkt1;
  if(av_dup_packet(pkt) < 0) {
    return -1;
  }
  pkt1 = av_malloc(sizeof(AVPacketList));
  if(!pkt1)
    return -1;
  pkt1->pkt = *pkt;
  pkt1->next = NULL;

  SDL_LockMutex(q->mutex);

  if(!q->last_pkt)
    q->first_pkt = pkt1;
  else
    q->last_pkt->next = pkt1;
  q->last_pkt = pkt1;
  q->nb_packets++;
  q->size += pkt1->pkt.size;
  SDL_CondSignal(q->cond);

  SDL_UnlockMutex(q->mutex);
  return 0;
}
static int packet_queue_get(PacketQueue *q, AVPacket *pkt, int block)
{
  AVPacketList *pkt1;
  int ret;

  SDL_LockMutex(q->mutex);

  for(;;) {
    if(quit) {
      ret = -1;
      break;
    }

    pkt1 = q->first_pkt;
    if(pkt1) {
      q->first_pkt = pkt1->next;
      if(!q->first_pkt)
        q->last_pkt = NULL;
      q->nb_packets--;
      q->size -= pkt1->pkt.size;
      *pkt = pkt1->pkt;
      av_free(pkt1);
      ret = 1;
      break;
    } else if(!block) {
      ret = 0;
      break;
    } else {
      SDL_CondWait(q->cond, q->mutex);
    }
  }
  SDL_UnlockMutex(q->mutex);
  return ret;
}
First thing we do
is to create global variables (shared memory for the video thread). We do this in struct VideoState.
int main(int argc, char *argv[]) {
  SDL_Event  event;
  VideoState *is;

  is = av_mallocz(sizeof(VideoState));
Explaining SDL_CreateThread()
Same as before with mutexes: we create the thread and use it via this function, which is a wrapper around the corresponding POSIX function. Note that here we create the thread explicitly (for audio we relied on SDL's internal audio thread). The SDL function gets a pointer to a function - this is the entry point, the "main", of the new thread.
decode thread.
Examine lines 500-580 in decode6. We are now reading packets using only this thread and putting what we read in two separate queues (one for video and one for audio, with a mutex and cond for each).
Video thread.
Examine the function stream_component_open. This function replaces the messy stream inspection we used in main. In this function we also explicitly open the video playback thread. (With audio, thread creation was implicit.)
Further reading
Read the tutorial at least up to chapter 6 (needed for the homework). I have created decode7.c and decode8.c for chapters 5 and 6. Check the quality difference, especially in songs performed live.