Discussion:
[ovs-discuss] Warning logs flooding: failed to flow_del (No such file or directory)
Han Zhou
2014-06-11 10:54:28 UTC
Permalink
Hello folks,

We encountered a problem on a hypervisor running OVS 2.0: warning logs
are flooding ovs-vswitchd.log:

2014-06-10T13:25:11.761Z|15084062|dpif|WARN|Dropped 67 log messages in
last 1 seconds (most recently, 1 seconds ago) due to excessive rate
2014-06-10T13:25:11.761Z|15084063|dpif|WARN|system at ovs-system: failed
to flow_del (No such file or directory) ...

This log is printed every second, and the flow does exist on the
hypervisor (shown by ovs-dpctl dump-flows).

Could someone help explain the root cause and suggest how to fix it?

Best regards,
Han
Han Zhou
2014-06-12 04:40:11 UTC
Permalink
Hi Alex,

I see that you encountered this error log before in your tests. Is
this a bug that has already been fixed? Could you help explain it and
advise how to clean up (this is a production hypervisor and there is
no plan to upgrade at the moment)?

Note: this is OVS 2.0. It seems OVS retries deleting one entry for 10
minutes and then switches to the next one. There are about 60+ such
"zombie" entries in the datapath flow table that never get deleted,
which results in the warning logs flooding.
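For reference, one way to count such never-hit entries is to grep the datapath dump on the "used:" statistic. This is only a sketch run on made-up sample lines; on a live box the input would come from `ovs-dpctl dump-flows` instead:

```shell
# Hypothetical sample of `ovs-dpctl dump-flows` output; real output
# carries the same "used:" statistic for every datapath flow.
dump='in_port(2),eth_type(0x0800),ipv4(src=10.0.0.1,dst=10.0.0.2,proto=6), packets:0, bytes:0, used:never, actions:drop
in_port(3),eth_type(0x0800),ipv4(src=10.0.0.3,dst=10.0.0.4,proto=6), packets:12, bytes:912, used:0.140s, actions:2'

# Count the entries that have never been hit since installation.
# Live equivalent: ovs-dpctl dump-flows | grep -c 'used:never'
printf '%s\n' "$dump" | grep -c 'used:never'   # prints 1
```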

Best regards,
Han
Post by Han Zhou
Hello folks,
We encountered a problem on a hypervisor with OVS2.0, that there are
2014-06-10T13:25:11.761Z|15084062|dpif|WARN|Dropped 67 log messages in
last 1 seconds (most recently, 1 seconds ago) due to excessive rate
2014-06-10T13:25:11.761Z|15084063|dpif|WARN|system at ovs-system: failed
to flow_del (No such file or directory) ...
This log is printed every 1 second. And the flow do exists on the
hypervisor (shown by ovs-dpctl dump-flows).
Could someone help explain the root cause, and suggest the actions to fix?
Best regards,
Han
Alex Wang
2014-06-12 05:30:03 UTC
Permalink
Hey Han,

Thanks for pointing this out,

Please see my reply inline,
Post by Han Zhou
Hi Alex,
I see that you encountered this error log before in your tests. Is
this a bug that's already fixed? Could you help explain and give
advice how to clean up (this is a production hypervisor and no plan to
upgrade at this moment)?
The issue I encountered is for ovs-2.1 and later, where we started
dumping flows from the datapath periodically. Basically, the warning
log means that OVS tried to delete a non-existent flow in the datapath.
Post by Han Zhou
Note: this is OVS2.0. It seems OVS is retrying deleting 1 entry for 10
minutes and switch to next one. There are about 60+ such "zombie"
entries in the datapath flow table which never gets deleted and
results in the warning logs flooding.
For OVS-2.0, we still use facet and subfacet in ofproto-dpif to keep
record of the flows, and the flow revalidation logic is quite different
from later branches. So, the fix in later branches does not apply here.


What you saw likely indicates the following scenario:
a datapath flow is dumped -> but ovs cannot find the corresponding
subfacet (due to a bug) -> so, ovs deletes the datapath flow -> later,
when the subfacet is to be deleted, ovs tries deleting the datapath
flow again -> but the datapath flow no longer exists, and you see the
warning in the log.


But it surprised me that you saw 60+ flows in the datapath that should
have been deleted. Did those 'zombie' flows get hit? Could you provide
more info about how to reproduce this?



Thanks,
Alex Wang,
Han Zhou
2014-06-12 05:52:48 UTC
Permalink
Post by Alex Wang
For OVS-2.0, we still use facet and subfacet in ofproto-dpif to keeps record
of the flows. And the flow revalidation logic is quite different from later
branches. So, the fix in later branches does not apply here.
What was the revalidation logic in OVS 2.0? Why does it retry for 10
minutes and then switch to the next one? I should check the 2.0 code
myself but have had no time yet.
Post by Alex Wang
What you saw likely indicates the following scenario: e.g.
a datapath flow could be dumped -> but ovs could not find corresponding
subfacet (due to bug) -> so, ovs deleted the datapath flow -> later when
the subfacet is to be deleted, it tries deleting the datapath flow -> but
the datapath flow does not exist and you saw the warning in log.
I can still dump the flows with ovs-dpctl dump-flows, but cannot
delete them with ovs-dpctl del-flow system at ovs-system "..."; the same
error (No such file or directory) is returned. Would it be helpful and
safe to run ovs-dpctl del-flows (to delete all datapath flows)?
Post by Alex Wang
But, it surprised me that you saw 60+ flows in datapath that should be
deleted. Did those 'Zombie' flows get hit? Could you provide more info
about how to reproduce it?
These flows were learned from a standalone bridge: they are for the STT
tunnel between hypervisors (TCP dst 7471). The "used" field is
"never". The STT port was changed a long time ago, because when
I tcpdump for port 7471 I can see the new port with the peer IP. So
these really are zombie flows, and for now I just want to clean them
up to stop the logs flooding. It is not easy to reproduce: I checked
several other hypervisors and there is no such issue.
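To isolate exactly these entries before cleaning them up, the dump can be filtered on the old STT port and the never-hit marker. A sketch on made-up lines (7471 is the STT port mentioned above; addresses and ports are hypothetical):

```shell
# Two hypothetical dump lines: a stale STT flow that was never hit, and
# a live flow on the same port that has recent traffic.
dump='in_port(2),eth_type(0x0800),ipv4(src=10.1.1.1,dst=10.1.1.2,proto=6),tcp(src=33000,dst=7471), packets:0, bytes:0, used:never, actions:drop
in_port(2),eth_type(0x0800),ipv4(src=10.1.1.1,dst=10.1.1.2,proto=6),tcp(src=33001,dst=7471), packets:9, bytes:700, used:1.2s, actions:3'

# Keep only the zombie candidates: STT port AND never used.
# Live equivalent: ovs-dpctl dump-flows | grep 'dst=7471' | grep 'used:never'
printf '%s\n' "$dump" | grep 'dst=7471' | grep 'used:never'
```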

Best regards,
Han
Alex Wang
2014-06-17 00:10:57 UTC
Permalink
Sorry for this very delayed reply,

After talking within ovs-team, we think it is best to use branch-2.1
which is LTE branch of OVS. If you still observe the issue, we will
work on a fix.

Thanks,
Alex Wang,
Alex Wang
2014-06-17 00:50:09 UTC
Permalink
Sorry for using the wrong term (LTS, not LTE) and misclaiming it
(branch-2.1 is not an LTS branch).

We just recommend that you use branch-2.1; that branch should work fine.

There is no new LTS branch for ovs.

Thanks,
Alex Wang,
Post by Alex Wang
Sorry for this very delayed reply,
After talking within ovs-team, we think it is best to use branch-2.1
which is LTE branch of OVS. If you still observe the issue, we will
work on a fix.
Thanks,
Alex Wang,
Han Zhou
2014-06-21 10:34:12 UTC
Permalink
Hi Alex,

Sure, on the boxes using OVS 2.1 there is no such issue.
Thanks for your help.

Best regards,
Han
Sorry for using the wrong term (LTS, not LTE) and misclaiming it (branch-2.1
is not a LTS branch).
We just recommend you to use branch-2.1, that branch should work fine.
There is no new LTS branch for ovs.
Thanks,
Alex Wang,
Post by Alex Wang
Sorry for this very delayed reply,
After talking within ovs-team, we think it is best to use branch-2.1
which is LTE branch of OVS. If you still observe the issue, we will
work on a fix.
Thanks,
Alex Wang,
Alex Wang
2014-06-30 21:36:19 UTC
Permalink
Hey Han,

We found the issue, fix has been sent here:
http://openvswitch.org/pipermail/dev/2014-June/042333.html

Thanks again for reporting,
Alex Wang,
Post by Han Zhou
Hello folks,
We encountered a problem on a hypervisor with OVS2.0, that there are
2014-06-10T13:25:11.761Z|15084062|dpif|WARN|Dropped 67 log messages in
last 1 seconds (most recently, 1 seconds ago) due to excessive rate
2014-06-10T13:25:11.761Z|15084063|dpif|WARN|system at ovs-system: failed
to flow_del (No such file or directory) ...
This log is printed every 1 second. And the flow do exists on the
hypervisor (shown by ovs-dpctl dump-flows).
Could someone help explain the root cause, and suggest the actions to fix?
Best regards,
Han
_______________________________________________
discuss mailing list
discuss at openvswitch.org
http://openvswitch.org/mailman/listinfo/discuss
Han Zhou
2014-07-02 02:38:07 UTC
Permalink
Hi Alex,

That's cool!
Just one more question.
Due to the race condition in userspace, there is chance that two
overlapping megaflows could be installed in datapath. And this
causes userspace unable to delete the less inclusive megaflow flow
even after it timeout, since the flow_del logic will stop at the
first match of masked flow.
Does this mean that if the more inclusive megaflow (say A) is deleted,
then the less inclusive megaflow (say B) can be deleted? If so, I can
have a workaround without upgrading the OVS version: I can find the
more inclusive megaflow and manually delete the flows with ovs-dpctl in
order (A -> B); then the warning logs should stop, right?
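If the ordering workaround is viable, the "more inclusive" candidates (flow A) can often be spotted in the dump by their fully wildcarded field masks, e.g. an ipv4 match with an all-zero mask. A sketch on made-up lines (the field layout follows ovs-dpctl dump-flows output, but the flows themselves are hypothetical):

```shell
# Two hypothetical overlapping megaflows. The first wildcards the IPv4
# addresses completely (mask 0.0.0.0), so it is the "more inclusive" one
# and would be deleted first (A), before the exact-match flow (B).
dump='in_port(2),ipv4(src=10.0.0.1/0.0.0.0,dst=10.0.0.2/0.0.0.0,proto=6/0), used:never
in_port(2),ipv4(src=10.0.0.1/255.255.255.255,dst=10.0.0.2/255.255.255.255,proto=6), used:never'

# List the fully wildcarded candidates first.
# Live equivalent: ovs-dpctl dump-flows | grep '/0\.0\.0\.0'
printf '%s\n' "$dump" | grep '/0\.0\.0\.0'
```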
Hey Han,
http://openvswitch.org/pipermail/dev/2014-June/042333.html
Thanks again for reporting,
Alex Wang,
Post by Han Zhou
Hello folks,
We encountered a problem on a hypervisor with OVS2.0, that there are
2014-06-10T13:25:11.761Z|15084062|dpif|WARN|Dropped 67 log messages in
last 1 seconds (most recently, 1 seconds ago) due to excessive rate
2014-06-10T13:25:11.761Z|15084063|dpif|WARN|system at ovs-system: failed
to flow_del (No such file or directory) ...
This log is printed every 1 second. And the flow do exists on the
hypervisor (shown by ovs-dpctl dump-flows).
Could someone help explain the root cause, and suggest the actions to fix?
Best regards,
Han
_______________________________________________
discuss mailing list
discuss at openvswitch.org
http://openvswitch.org/mailman/listinfo/discuss
Alex Wang
2014-07-02 06:30:29 UTC
Permalink
Post by Han Zhou
Hi Alex,
That's cool!
Just one more question.
Due to the race condition in userspace, there is chance that two
overlapping megaflows could be installed in datapath. And this
causes userspace unable to delete the less inclusive megaflow flow
even after it timeout, since the flow_del logic will stop at the
first match of masked flow.
Does it mean if the more inclusive megaflow (say A) is deleted, then
the less inclusive megaflow (say B) can be deleted? If so, I can have
a workaround without updating OVS version: I can find out the more
inclusive megaflow and manually delete flows with ovs-dpctl in order
(A -> B), then warning logs should stop, right?
Right. Note that you should prevent the relevant traffic from
installing the 'more inclusive' flow between the deletions.

Thanks,
Alex Wang,
Han Zhou
2014-07-02 08:22:14 UTC
Permalink
Hi Alex,
Post by Alex Wang
Post by Han Zhou
Hi Alex,
That's cool!
Just one more question.
Due to the race condition in userspace, there is chance that two
overlapping megaflows could be installed in datapath. And this
causes userspace unable to delete the less inclusive megaflow flow
even after it timeout, since the flow_del logic will stop at the
first match of masked flow.
Does it mean if the more inclusive megaflow (say A) is deleted, then
the less inclusive megaflow (say B) can be deleted? If so, I can have
a workaround without updating OVS version: I can find out the more
inclusive megaflow and manually delete flows with ovs-dpctl in order
(A -> B), then warning logs should stop, right?
Right. Note you should prevent the relevant traffic from installing the
'more inclusive' flow between deletions.
I did find out a more inclusive mega-flow, but failed when trying to
delete that one:
# ovs-dpctl del-flow system at ovs-system
"skb_priority(0),in_port(2),eth(src=30:f7:0d:9b:64:41,dst=78:45:c4:fb:c2:2f),eth_type(0x0800),ipv4(src=10.120.139.226/0.0.0.0,dst=10.120.116.69/0.0.0.0,proto=6/0,tos=0/0,ttl=61/0,frag=no/0xff)"
2014-07-02T08:16:29Z|00001|dpif|WARN|system at ovs-system: failed to
flow_del (Invalid argument)
skb_priority(0),in_port(2),eth(src=30:f7:0d:9b:64:41,dst=78:45:c4:fb:c2:2f),eth_type(0x0800),ipv4(src=10.120.139.226,dst=10.120.116.69,proto=6,tos=0,ttl=61,frag=no)
ovs-dpctl: deleting flow (Invalid argument)

The error is now "Invalid argument". So what's wrong here?

Best regards,
Han
Alex Wang
2014-07-02 14:28:21 UTC
Permalink
The tcp field is missing. Could you try 'ovs-dpctl dump-flows -m' to get
all the fields, and use that as input to 'ovs-dpctl del-flow'?
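One way to turn a dump line back into del-flow input is to strip the statistics and actions, keeping just the flow key. A sketch on a made-up line, assuming the dump separates the key from the stats with ", packets:" as in the logs above:

```shell
# Hypothetical `ovs-dpctl dump-flows -m` line: the flow key, followed by
# per-flow statistics and the actions.
line='skb_priority(0),in_port(2),eth_type(0x0800),ipv4(src=10.0.0.1,dst=10.0.0.2,proto=6,tos=0,ttl=61,frag=no),tcp(src=33000,dst=7471), packets:0, bytes:0, used:never, actions:drop'

# Cut everything from ", packets:" onward -- what remains is the exact
# key that can be passed to: ovs-dpctl del-flow system@ovs-system "<key>"
key=$(printf '%s\n' "$line" | sed 's/, packets:.*//')
printf '%s\n' "$key"
```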

Thanks,
Alex Wang,
Post by Han Zhou
Hi Alex,
Post by Alex Wang
Post by Han Zhou
Hi Alex,
That's cool!
Just one more question.
Due to the race condition in userspace, there is chance that two
overlapping megaflows could be installed in datapath. And this
causes userspace unable to delete the less inclusive megaflow flow
even after it timeout, since the flow_del logic will stop at the
first match of masked flow.
Does it mean if the more inclusive megaflow (say A) is deleted, then
the less inclusive megaflow (say B) can be deleted? If so, I can have
a workaround without updating OVS version: I can find out the more
inclusive megaflow and manually delete flows with ovs-dpctl in order
(A -> B), then warning logs should stop, right?
Right. Note you should prevent the relevant traffic from installing the
'more inclusive' flow between deletions.
I did find out a more inclusive mega-flow, but failed when trying to
# ovs-dpctl del-flow system at ovs-system
"skb_priority(0),in_port(2),eth(src=30:f7:0d:9b:64:41,dst=78:45:c4:fb:c2:2f),eth_type(0x0800),ipv4(src=10.120.139.226/0.0.0.0,dst=10.120.116.69/0.0.0.0,proto=6/0,tos=0/0,ttl=61/0,frag=no/0xff)"
2014-07-02T08:16:29Z|00001|dpif|WARN|system at ovs-system: failed to
flow_del (Invalid argument)
skb_priority(0),in_port(2),eth(src=30:f7:0d:9b:64:41,dst=78:45:c4:fb:c2:2f),eth_type(0x0800),ipv4(src=10.120.139.226,dst=10.120.116.69,proto=6,tos=0,ttl=61,frag=no)
ovs-dpctl: deleting flow (Invalid argument)
The error is now "Invalid argument". So what's wrong here?
Best regards,
Han
Han Zhou
2014-07-03 07:53:54 UTC
Permalink
That's helpful. But after deleting the more inclusive flow I still
cannot delete the "zombie" flow. It might be that there are still more
overlapping flows.
So, I tried ovs-dpctl del-flows, and all the "zombie" flows are gone :)
Post by Alex Wang
The tcp field is missing, could you try 'ovs-dpctl dump-flow -m' to get
all fields. And use it as input to 'ovsdpctl flow-del'
Thanks,
Alex Wang,
Post by Han Zhou
Hi Alex,
Post by Alex Wang
Post by Han Zhou
Hi Alex,
That's cool!
Just one more question.
Due to the race condition in userspace, there is chance that two
overlapping megaflows could be installed in datapath. And this
causes userspace unable to delete the less inclusive megaflow flow
even after it timeout, since the flow_del logic will stop at the
first match of masked flow.
Does it mean if the more inclusive megaflow (say A) is deleted, then
the less inclusive megaflow (say B) can be deleted? If so, I can have
a workaround without updating OVS version: I can find out the more
inclusive megaflow and manually delete flows with ovs-dpctl in order
(A -> B), then warning logs should stop, right?
Right. Note you should prevent the relevant traffic from installing the
'more inclusive' flow between deletions.
I did find out a more inclusive mega-flow, but failed when trying to
# ovs-dpctl del-flow system at ovs-system
"skb_priority(0),in_port(2),eth(src=30:f7:0d:9b:64:41,dst=78:45:c4:fb:c2:2f),eth_type(0x0800),ipv4(src=10.120.139.226/0.0.0.0,dst=10.120.116.69/0.0.0.0,proto=6/0,tos=0/0,ttl=61/0,frag=no/0xff)"
2014-07-02T08:16:29Z|00001|dpif|WARN|system at ovs-system: failed to
flow_del (Invalid argument)
skb_priority(0),in_port(2),eth(src=30:f7:0d:9b:64:41,dst=78:45:c4:fb:c2:2f),eth_type(0x0800),ipv4(src=10.120.139.226,dst=10.120.116.69,proto=6,tos=0,ttl=61,frag=no)
ovs-dpctl: deleting flow (Invalid argument)
The error is now "Invalid argument". So what's wrong here?
Best regards,
Han
Alex Wang
2014-07-03 15:42:22 UTC
Permalink
I'm not exactly sure about the dp flows you have... but yes, del-flows
will purge all flows.

Also, if you stop the traffic that is hitting the 'more inclusive'
flow, then both flows ('inclusive' and 'phantom') should be deleted.

Thanks,
Alex Wang,
Post by Han Zhou
That's helpful. But after deleting the more inclusive flow I still
cannot delete the "zombie" flow. It might be there are still more
overlapping flows.
So, I tried ovs-dpctl del-flows, and all the "zombie" flows are gone :)
Post by Alex Wang
The tcp field is missing, could you try 'ovs-dpctl dump-flow -m' to get
all fields. And use it as input to 'ovsdpctl flow-del'
Thanks,
Alex Wang,
Post by Han Zhou
Hi Alex,
Post by Alex Wang
Post by Han Zhou
Hi Alex,
That's cool!
Just one more question.
Due to the race condition in userspace, there is chance that two
overlapping megaflows could be installed in datapath. And this
causes userspace unable to delete the less inclusive megaflow flow
even after it timeout, since the flow_del logic will stop at the
first match of masked flow.
Does it mean if the more inclusive megaflow (say A) is deleted, then
the less inclusive megaflow (say B) can be deleted? If so, I can have
a workaround without updating OVS version: I can find out the more
inclusive megaflow and manually delete flows with ovs-dpctl in order
(A -> B), then warning logs should stop, right?
Right. Note you should prevent the relevant traffic from installing the
'more inclusive' flow between deletions.
I did find out a more inclusive mega-flow, but failed when trying to
delete that one:
# ovs-dpctl del-flow system at ovs-system
"skb_priority(0),in_port(2),eth(src=30:f7:0d:9b:64:41,dst=78:45:c4:fb:c2:2f),eth_type(0x0800),ipv4(src=10.120.139.226/0.0.0.0,dst=10.120.116.69/0.0.0.0,proto=6/0,tos=0/0,ttl=61/0,frag=no/0xff)"
2014-07-02T08:16:29Z|00001|dpif|WARN|system at ovs-system: failed to
flow_del (Invalid argument)
skb_priority(0),in_port(2),eth(src=30:f7:0d:9b:64:41,dst=78:45:c4:fb:c2:2f),eth_type(0x0800),ipv4(src=10.120.139.226,dst=10.120.116.69,proto=6,tos=0,ttl=61,frag=no)
ovs-dpctl: deleting flow (Invalid argument)
The error is now "Invalid argument". So what's wrong here?
Best regards,
Han
Andrey Korolyov
2014-10-03 23:20:58 UTC
Permalink
Post by Alex Wang
I'm not exactly sure about the dp flow you have... but yes, del-flows
will purge all flows.
Also, if you stop to traffic that hitting the 'more inclusive' flow, then,
both flows ('inclusive' and 'phantom') should be deleted.
Thanks,
Alex Wang,
Post by Han Zhou
That's helpful. But after deleting the more inclusive flow I still
cannot delete the "zombie" flow. It might be there are still more
overlapping flows.
So, I tried ovs-dpctl del-flows, and all the "zombie" flows are gone :)
Post by Alex Wang
The tcp field is missing, could you try 'ovs-dpctl dump-flow -m' to get
all fields. And use it as input to 'ovsdpctl flow-del'
Thanks,
Alex Wang,
Post by Han Zhou
Hi Alex,
Post by Alex Wang
Post by Han Zhou
Hi Alex,
That's cool!
Just one more question.
Due to the race condition in userspace, there is chance that two
overlapping megaflows could be installed in datapath. And this
causes userspace unable to delete the less inclusive megaflow flow
even after it timeout, since the flow_del logic will stop at the
first match of masked flow.
Does it mean if the more inclusive megaflow (say A) is deleted, then
the less inclusive megaflow (say B) can be deleted? If so, I can have
a workaround without updating OVS version: I can find out the more
inclusive megaflow and manually delete flows with ovs-dpctl in order
(A -> B), then warning logs should stop, right?
Right. Note you should prevent the relevant traffic from installing the
'more inclusive' flow between deletions.
I did find out a more inclusive mega-flow, but failed when trying to
# ovs-dpctl del-flow system at ovs-system
"skb_priority(0),in_port(2),eth(src=30:f7:0d:9b:64:41,dst=78:45:c4:fb:c2:2f),eth_type(0x0800),ipv4(src=10.120.139.226/0.0.0.0,dst=10.120.116.69/0.0.0.0,proto=6/0,tos=0/0,ttl=61/0,frag=no/0xff)"
2014-07-02T08:16:29Z|00001|dpif|WARN|system at ovs-system: failed to
flow_del (Invalid argument)
skb_priority(0),in_port(2),eth(src=30:f7:0d:9b:64:41,dst=78:45:c4:fb:c2:2f),eth_type(0x0800),ipv4(src=10.120.139.226,dst=10.120.116.69,proto=6,tos=0,ttl=61,frag=no)
ovs-dpctl: deleting flow (Invalid argument)
The error is now "Invalid argument". So what's wrong here?
Best regards,
Han
Hello,

I am getting a similar message on 2.1.3, where the problem described in
this conversation should not take place:

2014-10-03T17:24:30.429Z|00026|dpif(revalidator_11)|WARN|system at ovs-system:
failed to flow_del (No such file or directory)
skb_priority(0),in_port(2),skb_mark(0),eth(src=xx:xx:xx:xx:xx:xx,dst=yy:yy:yy:yy:yy:yy),eth_type(0x0800),ipv4(src=zz.zz.zz.zz,dst=ii.jj.kk.ll,proto=6,tos=0,ttl=118,frag=no),tcp(src=nnn,dst=mmm),tcp_flags(0x018)


The preconditions are as follows:
- 30k installed non-overlapping ip mask flows, ttl=4h,
- 3k dp flows,
- moderate traffic, about 200 Mbit, hitting 2000 prefixes/s.

The error appears long before the flows' expiration time, so I think
something is wrong. Also, even with a very low flow_add rate (~5/s) I
am seeing moderately high CPU consumption by vswitchd with this setup,
about 50% of a single Xeon core, compared to 20% for a proactive-only
match on the same traffic with a small number of installed rules (~100).
Alex Wang
2014-10-04 01:39:55 UTC
Permalink
Hey Andrey,

Thanks for reporting the issue. Please see my reply below.

Post by Andrey Korolyov
Hello,
getting the lookalike message on 2.1.3 where the problem described in
Yes, v2.1.3 contains the fix commit 3601bd879 (datapath: Use exact lookup
for flow_get and flow_del.)
Post by Andrey Korolyov
2014-10-03T17:24:30.429Z|00026|dpif(revalidator_11)|WARN|system at ovs-system
failed to flow_del (No such file or directory)
skb_priority(0),in_port(2),skb_mark(0),eth(src=xx:xx:xx:xx:xx:xx,dst=yy:yy:yy:yy:yy:yy),eth_type(0x0800),ipv4(src=zz.zz.zz.zz,dst=ii.jj.kk.ll,proto=6,tos=0,ttl=118,frag=no),tcp(src=nnn,dst=mmm),tcp_flags(0x018)
Could you help confirm the following things?

1. if the code you use contains commit 3601bd879

2. for the flows in 'failed to flow_del' logs, were they the same flow or
same set of flows? or totally random flows?

One thing that comes to my mind is that, for v2.1.3, the same dp flow
could actually be dumped more than once, since the flow dumping happens
in batches. For example, assume the current batch dumps flow A; if flow
adds before the next dump move flow A back in the linked list, then the
next dump will dump flow A again.

So, the revalidator will process flow A twice. If the first pass
decides to delete flow A, the next pass will try to do the same thing
but will hit this 'failed to flow_del' error.


3. don't know if it is possible for you to provide the result of the
following command:

ovs-dpctl dump-flows | sed -n 's/^.*\(used:[0-9.]\+\).*$/\1/p'
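On a saved dump, that extraction behaves like this (sample lines are made up; note the pattern only matches numeric used: values under GNU sed, so used:never entries are skipped):

```shell
# Two hypothetical dump lines: one recently hit, one never hit.
dump='in_port(2),eth_type(0x0800), packets:3, bytes:180, used:0.564s, actions:2
in_port(3),eth_type(0x0800), packets:0, bytes:0, used:never, actions:drop'

# Same sed extraction as above, run on the sample instead of a live
# datapath; only the numeric "used:" value survives.
printf '%s\n' "$dump" | sed -n 's/^.*\(used:[0-9.]\+\).*$/\1/p'   # prints used:0.564
```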
Post by Andrey Korolyov
- 30k installed non-overlapping ip mask flows, ttl=4h,
- 3k dp flows,
- moderate traffic, on about 200mbit and hitting 2000prefixes/s.
The error appears long before flow expiration time, so I think that
there is something wrong. Also even with very low flow_add ratio (~
5/s) I am getting moderately high cpu consumption by vswitchd with
such a setup, on about 50% of single Xeon core, comparing to 20% for
proactive-only match for same traffic with small number of installed
rules (~100).
For 'flow_add', do you mean OpenFlow flows? If so, it could take more
time to add/del flows in the classifier.

Do you have more interfaces as compared to the other setup you mentioned?
With more interfaces, it would take more time to update stats/status.

Thanks,
Alex Wang,
Andrey Korolyov
2014-10-08 08:31:14 UTC
Permalink
Thanks Alex,
Post by Alex Wang
Hey Andrey,
Thanks for reporting the issue, Please see my reply below,
Post by Andrey Korolyov
Hello,
getting the lookalike message on 2.1.3 where the problem described in
Yes, v2.1.3 contains the fix commit 3601bd879 (datapath: Use exact lookup
for flow_get and flow_del.)
Sorry, it turns out that the datapath I am using is a pre-release
2.1.3 from early May; I will update and report back soon.
Post by Alex Wang
Post by Andrey Korolyov
failed to flow_del (No such file or directory)
skb_priority(0),in_port(2),skb_mark(0),eth(src=xx:xx:xx:xx:xx:xx,dst=yy:yy:yy:yy:yy:yy),eth_type(0x0800),ipv4(src=zz.zz.zz.zz,dst=ii.jj.kk.ll,proto=6,tos=0,ttl=118,frag=no),tcp(src=nnn,dst=mmm),tcp_flags(0x018)
Could you help confirm the following things?
1. if the code you use contains commit 3601bd879
2. for the flows in 'failed to flow_del' logs, were they the same flow or
same set of flows? or totally random flows?
They look random, though occurrences are very rare (about 1 per
10k OpenFlow flow_adds).
Post by Alex Wang
One thing comes to my mind is that, for v2.1.3, the same dp flow could
actually be dumped more than once. since the flow dumping happens in batch,
so for example assume the current batch dumped flow A, then the flow adds
before the next dump move back the position of flow A in the link list, then
the next dump will dump flow A again.
So, the revalidator will process flow A twice. And if the first process
decides to delete flow A. The next process will do the same thing but
result in this 'failed to flow_del' error~
3. don't know if it is possible for your to provide the result of following
ovs-dpctl dump-flows | sed -n 's/^.*\(used:[0-9.]\+\).*$/\1/p'
Post by Andrey Korolyov
- 30k installed non-overlapping ip mask flows, ttl=4h,
- 3k dp flows,
- moderate traffic, on about 200mbit and hitting 2000prefixes/s.
The error appears long before flow` expiration time, so I think that
there is something wrong. Also even with very low flow_add ratio (~
5/s) I am getting moderately high cpu consumption by vswitchd with
such a setup, on about 50% of single Xeon core, comparing to 20% for
proactive-only match for same traffic with small number of installed
rules (~100).
For 'flow_add', do you mean OpenFlow flows? If so, i could more time to
add/del flows in classifier,
Do you have more interfaces as compared to the other setup you mentioned?
With more interfaces, it would take more time to update stats/status.
Thanks,
Alex Wang,
Andrey Korolyov
2014-10-15 14:51:07 UTC
Permalink
Hi Alex,

during the transition from 2.1.3 to 2.3.1 (both userspace and kernel
module), a 'first ping lost' issue appeared with completely static
reactive flows, so I am still investigating it, because this behavior
can affect the flow lifecycle with a mixed proactive+reactive scheme.
If anyone is interested in some kind of debugging output for this exact
issue, please ask me. Hopefully I'll be able to fix this soon and will
report back on the original issue.
Alex Wang
2014-10-15 16:20:07 UTC
Permalink
Hey Andrey,
Post by Andrey Korolyov
Hi Alex,
during transition from 2.1.3 to the 2.3.1 (both userspace and kernel
module) the 'first ping lost' appeared with completely static reactive
flows, so I am still investigating it,
Could you provide more info on when you did the ping? (i.e. after
upgrading both userspace and kernel module, or after upgrading just one
of them)
Post by Andrey Korolyov
because this behavior can
affect flow lifecycle with mixed proactive+reactive scheme. If anyone
is interested in some kind of debugging output for this exact issue,
please ask me. Hopefully I`ll be able to fix this soon and will report
on the original issue.
Feel free to post any debug info,