From erwanaliasr1 at gmail.com Mon Nov 12 16:54:17 2012
From: erwanaliasr1 at gmail.com (Erwan Velu)
Date: Mon, 12 Nov 2012 16:54:17 +0100
Subject: [Udpcast] Assert Failed in fakeSlice
Message-ID:

Hi all,

On a very weak system, while doing lots of I/O, udp-receiver ends up
crashing with:

udp-receiver: receivedata.c:324: fakeSliceComplete: Assertion `slice != ((void *)0)' failed.

Any idea what can trigger this assertion? It is pretty easy to
reproduce here.

Cheers,
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From erwanaliasr1 at gmail.com Tue Nov 13 10:56:26 2012
From: erwanaliasr1 at gmail.com (Erwan Velu)
Date: Tue, 13 Nov 2012 10:56:26 +0100
Subject: [Udpcast] Udpcast can lock if too much IOs
Message-ID:

Dear All,

I've spotted a serious issue that we are hitting in production.

We are running in async mode with a fixed bandwidth and FEC 8x8.

If the storage device's throughput drops below the network bandwidth,
then after a while udpcast locks up.

I enabled debugging and discovered that the free blocks count loses
1 byte very often (surely linked to the FEC).

Once the free blocks count reaches 1, the process is completely locked.

Aside from this bug, I found that udpcast sends the same data several
times although it had already been sent once. This may be related.

Any thoughts on this?

Thanks,

17:01:26.073493 free blocks: got 4096 bytes
17:01:37.179820 free blocks: got 4095 bytes
17:01:37.180049 free blocks: got 4094 bytes
17:01:37.180210 free blocks: got 4093 bytes
17:01:37.180357 free blocks: got 4092 bytes
17:01:37.180502 free blocks: got 4091 bytes
17:01:37.180687 free blocks: got 4090 bytes
17:01:37.180834 free blocks: got 4089 bytes
17:01:37.180980 free blocks: got 4088 bytes
[...]
17:02:04.364385 free blocks: got 12 bytes
17:02:04.364595 free blocks: got 11 bytes
17:02:04.364794 free blocks: got 10 bytes
17:02:04.365114 free blocks: got 9 bytes
17:02:04.365310 free blocks: got 8 bytes
17:02:04.365544 free blocks: got 7 bytes
17:02:04.365739 free blocks: got 6 bytes
17:02:04.365979 free blocks: got 5 bytes
17:02:04.366247 free blocks: got 4 bytes
17:02:04.366472 free blocks: got 3 bytes
17:02:04.366687 free blocks: got 2 bytes
17:02:04.572361 free blocks: got 1 bytes

From erwanaliasr1 at gmail.com Tue Nov 13 18:07:21 2012
From: erwanaliasr1 at gmail.com (Erwan Velu)
Date: Tue, 13 Nov 2012 18:07:21 +0100
Subject: [Udpcast] Udpcast can lock if too much IOs
In-Reply-To:
References:
Message-ID:

In fact, we are receiving many more packets than we are able to process
in the same time. This leaves not a single block free for new packets,
and we get stuck in this state.

I can't find any check that prevents us from getting into this
situation. Shouldn't we drop the incoming packets if we can't store them
inside the structure while processing the already received packets?

I'm working on this part, but frankly it's hard to understand the code
without any documentation and with very few comments in the code.

Is there a public svn or git repository for this project? Maybe you have
some patches on this topic.

My 2 cents,
Erwan

2012/11/13 Erwan Velu

> Dear All,
>
> I've spotted a serious issue that we are hitting in production.
>
> We are running in async mode with a fixed bandwidth and FEC 8x8.
>
> If the storage device's throughput drops below the network bandwidth,
> then after a while udpcast locks up.
>
> I enabled debugging and discovered that the free blocks count loses
> 1 byte very often (surely linked to the FEC).
>
> Once the free blocks count reaches 1, the process is completely locked.
>
> Aside from this bug, I found that udpcast sends the same data several
> times although it had already been sent once. This may be related.
>
> Any thoughts on this?
>
> Thanks,
>
> 17:01:26.073493 free blocks: got 4096 bytes
> 17:01:37.179820 free blocks: got 4095 bytes
> 17:01:37.180049 free blocks: got 4094 bytes
> 17:01:37.180210 free blocks: got 4093 bytes
> 17:01:37.180357 free blocks: got 4092 bytes
> 17:01:37.180502 free blocks: got 4091 bytes
> 17:01:37.180687 free blocks: got 4090 bytes
> 17:01:37.180834 free blocks: got 4089 bytes
> 17:01:37.180980 free blocks: got 4088 bytes
> [...]
> 17:02:04.364385 free blocks: got 12 bytes
> 17:02:04.364595 free blocks: got 11 bytes
> 17:02:04.364794 free blocks: got 10 bytes
> 17:02:04.365114 free blocks: got 9 bytes
> 17:02:04.365310 free blocks: got 8 bytes
> 17:02:04.365544 free blocks: got 7 bytes
> 17:02:04.365739 free blocks: got 6 bytes
> 17:02:04.365979 free blocks: got 5 bytes
> 17:02:04.366247 free blocks: got 4 bytes
> 17:02:04.366472 free blocks: got 3 bytes
> 17:02:04.366687 free blocks: got 2 bytes
> 17:02:04.572361 free blocks: got 1 bytes
>

From erwanaliasr1 at gmail.com Wed Nov 14 20:07:58 2012
From: erwanaliasr1 at gmail.com (Erwan Velu)
Date: Wed, 14 Nov 2012 20:07:58 +0100
Subject: [Udpcast] Solving deadlock
Message-ID:

Hi there,

After a few emails on the mailing list, I found the bug.

In fact, when FEC blocks are done, they are never released, which
eventually leads to a shortage of freeBlocks and causes the deadlock.

So AFAIK, the recent releases are affected when FEC is enabled.

As the release code mostly existed already, I made it generic.
I don't know if it's the cleanest way to solve it, but at least it
solved my deadlock issue.

diff --git a/receivedata.c b/receivedata.c
index 3f213d5..692260d 100644
--- a/receivedata.c
+++ b/receivedata.c
@@ -135,6 +135,8 @@ struct clientState {
 #endif
 };
 
+static void freeFecBlocks(struct clientState *clst, slice_t slice);
+
 static void printMissedBlockMap(struct clientState *clst, slice_t slice)
 {
     int i, first=1;
@@ -460,7 +462,11 @@ static void cleanupSlices(struct clientState *clst, unsigned int doneState)
                clst->slices[pos].sliceNo, pos, &clst->slices[pos]);
 #endif
         pc_produce(clst->free_slices_pc, 1);
-
+
+        if (doneState == SLICE_FEC_DONE) {
+            freeFecBlocks(clst,slice);
+        }
+
         /* if at end, exit this thread */
         if(!bytes) {
             clst->endReached = 2;
@@ -586,6 +592,23 @@ static void fec_decode_one_stripe(struct clientState *clst,
 }
 
+static void freeFecBlocks(struct clientState *clst, slice_t slice) {
+    int stripes = slice->fec_stripes;
+    struct fec_desc *fec_descs = slice->fec_descs;
+    int stripe;
+    for(stripe=0; stripe<stripes; stripe++) {
+        int i;
+        assert(slice->missing_data_blocks[stripe] >=
+               slice->fec_blocks[stripe]);
+        for(i=0; i<slice->fec_blocks[stripe]; i++) {
+            if (fec_descs[stripe+i*stripes].adr != NULL) {
+                freeBlockSpace(clst,fec_descs[stripe+i*stripes].adr);
+                fec_descs[stripe+i*stripes].adr=0;
+            }
+        }
+    }
+}
+
 static THREAD_RETURN fecMain(void *args0)
 {
     struct clientState *clst = (struct clientState *) args0;
@@ -627,15 +650,7 @@ static THREAD_RETURN fecMain(void *args0)
             }
 
             slice->state = SLICE_FEC_DONE;
-            for(stripe=0; stripe<stripes; stripe++) {
-                int i;
-                assert(slice->missing_data_blocks[stripe] >=
-                       slice->fec_blocks[stripe]);
-                for(i=0; i<slice->fec_blocks[stripe]; i++) {
-                    freeBlockSpace(clst,fec_descs[stripe+i*stripes].adr);
-                    fec_descs[stripe+i*stripes].adr=0;
-                }
-            }
+            freeFecBlocks(clst,slice);
         } else if(slice->state == SLICE_DONE) {
             slice->state = SLICE_FEC_DONE;
         }
-- 
1.7.2.5

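As a rough illustration of the failure mode discussed in this thread, and not udpcast's actual code, the sketch below models a fixed pool of blocks from which every slice borrows a few FEC blocks. It assumes, as the diff above reads, that the blocks were only returned on the path where FEC decoding actually ran, so slices that completed without decoding leaked them. All names here (`struct pool`, `pool_take`, `pool_give`, `run_slices`, `POOL_BLOCKS`, `FEC_PER_SLICE`) and the "every other slice needs decoding" pattern are hypothetical and chosen only to make the leak visible.

```c
#include <assert.h>
#include <stdio.h>

#define POOL_BLOCKS   8   /* hypothetical pool size (the log in the thread starts at 4096) */
#define FEC_PER_SLICE 2   /* hypothetical number of FEC blocks borrowed per slice */

/* Hypothetical stand-in for the receiver's free-block accounting. */
struct pool {
    int free_blocks;
};

/* Borrow n blocks; returns 0 when the pool cannot satisfy the request. */
static int pool_take(struct pool *p, int n)
{
    if (p->free_blocks < n)
        return 0;
    p->free_blocks -= n;
    return 1;
}

/* Return n blocks to the pool. */
static void pool_give(struct pool *p, int n)
{
    p->free_blocks += n;
    assert(p->free_blocks <= POOL_BLOCKS);
}

/*
 * Process up to max_slices slices and return how many completed before
 * the pool ran dry. When release_on_skip is 0, blocks are returned only
 * for slices that went through decoding (the leaking behaviour). When it
 * is 1, blocks are returned for every finished slice, which is the idea
 * behind calling the release code from the cleanup path as well.
 */
static int run_slices(int max_slices, int release_on_skip)
{
    struct pool p = { POOL_BLOCKS };
    int done = 0;

    while (done < max_slices) {
        int needs_decode;

        if (!pool_take(&p, FEC_PER_SLICE))
            break;                        /* no free block left: receiver stalls */

        needs_decode = (done % 2 == 0);   /* assumed: every other slice lost data */
        if (needs_decode || release_on_skip)
            pool_give(&p, FEC_PER_SLICE);

        done++;
    }
    return done;
}
```

With these made-up numbers, `run_slices(100, 0)` stops after 8 slices because the pool is exhausted, while `run_slices(100, 1)` completes all 100. The pattern matches the steadily shrinking "free blocks" counter in the log, although the real pool size and leak rate depend on the actual transfer.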
From vergueira.marco at gmail.com Mon Nov 26 16:26:40 2012
From: vergueira.marco at gmail.com (Marco Vergueira)
Date: Mon, 26 Nov 2012 15:26:40 +0000 (UTC)
Subject: [Udpcast] Invitation to connect on LinkedIn
Message-ID: <1977420870.20590427.1353943600868.JavaMail.app@ela4-app2319.prod>

LinkedIn
------------

I'd like to add you to my professional network on LinkedIn.

-Marco

Marco Vergueira
Student at Instituto Superior de Engenharia do Porto
Porto Area, Portugal

Confirm that you know Marco Vergueira:
https://www.linkedin.com/e/-4j6c3z-h9zr28gq-5c/isd/9761185133/OImb5M4r/?hs=false&tok=2STv-Vm0RukRw1

--
You are receiving Invitation to Connect emails. Click to unsubscribe:
http://www.linkedin.com/e/-4j6c3z-h9zr28gq-5c/uM-RQ0hecpKzmvwLIeNfLck0TO58-zrJHBLSEi/goo/udpcast%40udpcast%2Elinux%2Elu/20061/I3259124648_1/?hs=false&tok=3zhG7zkvpukRw1

(c) 2012 LinkedIn Corporation. 2029 Stierlin Ct, Mountain View, CA 94043, USA.